dherault / serverless-offline

Emulate AWS λ and API Gateway locally when developing your Serverless project
MIT License
5.19k stars 796 forks source link

After code changes, old code is still used #1659

Open trygveaa opened 1 year ago

trygveaa commented 1 year ago

Bug Report

Current Behavior

I have a javascript/typescript project using serverless-offline. Previously after I made changes to a file and then made a call to a http endpoint, it would run the new code. After version 9 however, it's still running the old code. So I have to restart serverless-offline after every change I make, which of course makes development very slow.

Sample Code

https://github.com/trygveaa/serverless-offline-reload-bug

Steps to reproduce:

  1. Run npm run start in that repo
  2. Make a request to http://localhost:3000/dev/hello
  3. Change the return body in src/handler.js to something else.
  4. Make a new request to http://localhost:3000/dev/hello

It's important that you make the request in step 2, as it will pick up any changes done before the first request, but after one request is made, it will always run that code until it's restarted.

Expected behavior/code

When a request is made, the current code should always be run.

Environment

Additional context/Screenshots

I bisected the issue and found that it was introduced in commit 78fec17b. As you can see from the repo, I'm not using the allowCache option that was removed there, so there must be some unintended side effects by that commit.

mfang0126 commented 1 year ago

Got the same issue here too. The only thing I can do for now is rollback.

mfang0126 commented 1 year ago

@trygveaa Just add --reloadHandler. It works for me. I saw this option when I check it by using sls offline --help.

❯ sls offline --help
Running "serverless" from node_modules
(node:4684) NOTE: The AWS SDK for JavaScript (v2) will be put into maintenance mode in 2023.

Please migrate your code to use AWS SDK for JavaScript (v3).
For more information, check the migration guide at https://a.co/7PzMCcy
(Use `sls --trace-warnings ...` to show where the warning was created)
offline                         Simulates API Gateway to call your lambda functions offline.
offline start                   Simulates API Gateway to call your lambda functions offline using backward compatible initialization.
......
--noPrependStageInUrl           Don't prepend http routes with the stage.
--noTimeout / -t                Disables the timeout feature.
--prefix / -p                   Adds a prefix to every path, to send your requests to http://localhost:3000/prefix/[your_path] instead.
--reloadHandler                 Reloads handler with each request.
......
trygveaa commented 1 year ago

@mfang0126: Ah, thanks!

@dnalborczyk: Is there any reason this isn't the default behavior anymore?

I think it would be good to mention it in the changelog at least. Now it just says "remove allow cache option" under bug fixes and "add reload handler flag" under features, but nothing related under breaking changes and no mention that the default behavior has changed.

trygveaa commented 1 year ago

Hm, no, when --useInProcess is used, the --reloadHandler option doesn't help for me. The issue persists when using both of these options.

noppa commented 1 year ago

@trygveaa I don't think it's possible to combine in-process mode and reloading anymore after v9. From the changelog

handler reloading within the node.js process (in-process mode) was removed and is not possible anymore, as it was buggy and the cause for countless memory leaks

trygveaa commented 1 year ago

@noppa: Oh, thanks for the info. I had just looked in the CHANGELOG.md file where it doesn't say that. I didn't realize there was more info on the GitHub release page :/

That sucks though, as reloading the whole node process takes significantly longer than just the handler (and I would prefer reloading just the handler even if it has memory leaks, the extra memory usage is worth the reduced waiting). Personally, I used the previous reloading without any issues I can recall. Dropping --useInProcess is problematic because it has issues with debugging, as reported in #1655.

sergioflores-j commented 1 year ago

@trygveaa I agree with you. HMR is a must-have for serverless-offline, regardless of how it's being loaded. (In-Process or new Worker Threads)

I have a project using serverless-esbuild to rebuild my modules with watch mode. I'd expect serverless-offline to use the newest available code to run the handler.

Reproduction: With v8.x it works as expected: with-sls-offline-v8 With v12.x it does not: with v12

RichardBradley commented 1 year ago

I had the same problem after recently updating and eventually found this thread and https://stackoverflow.com/questions/74389507/serverless-offline-start-hot-reload-not-working

I expected watching for code changes to be a core feature of serverless-offline, so I didn't expect a non-default option to be required. I think this is a significant regression.

What does --reloadHandler do and why is it needed in newer versions of serverless-offline? The stackoverflow question I linked to above suggested that it might have a significant perf impact vs code watching, although I can't say that I've noticed a big difference on my large serverless + typescript project.

Is this being looked into? Is there anything I can do to assist?

noppa commented 1 year ago

I'm not the author but I think I may be able to provide some info, based on reading the changelogs and working to get our dev setup at work to work nicely with the newer versions of serverless-offline.

Originally, the way the reloading here worked was that when a file changes (or on every request? not sure), serverless-offline would just purge the Node.js require cache, and that's how the changed files would re-evaluate.

So bascally

// a.js: console.log("A")
require('./a.js') // logs A
// change a.js to: console.log("AAAAA") 
delete require.cache[require.resolve('./a.js')] // serverless-offline purges cache
require('./a.js') // logs AAAAA

I'm pretty sure they also didn't purge files under node_modules, because those rarely change and usually contribute a big percentage of an application's launch time.

Apparently this approach caused some memory leak issues. I can imagine that if you, for example, do something like

// a.js
setInterval(() => console.log("A"))

then even though a.js is purged from cache, there's nothing cleaning up the interval. And the interval is a pretty convoluted example here, any callback provided to any delayed handler would leak memory this way. I did see that serverless-offline would crash with out of memory exception if left running for too long, but personally I didn't mind that much. From the author's point of view though I can see that the bug reports for issues like that could be frustrating as they are pretty fundamental to the approach and not easily fixable.

So then that leads to the new approach - launching new worker threads for requests. Each new worker will not only load their own instances of fresh files from disk, but also gets callbacks like setInterval or whatever tightly coupled to the lifetime of the thread. So if the worker thread is terminated, then all those callbacks & memory is cleaned up too.

With worker threads, it's easy to reload fresh files from disk; just launch a new worker for the next request. However, if you lauch a new worker on every single request, then there's a problem: it can get quite slow. In the codebase where I'm working, a request going to a fresh new worker takes about 1500ms, whereas a request going to an existing "hot" worker takes about 150ms, so there's about 10x difference. Waiting 1.5s for one single request isn't too bad, but these things pile up when your client app makes tens of requests.
I assume that's the reason it was removed from defaults and the --reloadHandler flag was added to do this.

So currently the only available cache-invalidation strategy is to invalidate it on every request (so basically not have cache at all). In #1660, I requested an alternative: cache -invalidation strategy that would watch file changes and invalidate when that happens. As I mentioned there, doing that is now possible with another plugin, serverless-offline-watcher.

I guess there's also one other possible issue with the Worker thread approach. Apparently it causes debugging issues in some setups, like @trygveaa said. I can't speak to that, for me VSCode debugger works fine with the Workers too.

Anyway, I think it would be possible to create a custom plugin that invalidates the in-process Node.js cache, basically bringing back the old (somewhat hacky) behaviour, and then perhaps use serverless-offline-watcher to invoke that custom plugin when files change.

Something like

// serverless-offline-in-process-reload-plugin.js

export default class ServerlessOfflineInProcessReloadPlugin {
  hooks = {
    'offlineInProcessReload:reload': () => this.reload(),
  }

  reload() {
    for (const key of Object.keys(require.cache) {
      if (!key.includes('node_modules')) {
        delete require.cache[key]
      }
    }
  }
}

...or something along those lines... And then in serverless.yml

plugins:
  - serverless-offline
  - serverless-offline-watcher
  - ./serverless-offline-in-process-reload-plugin.js

custom:
  serverless-offline-watcher:
    - path:
        - src/**/*
      hook: offlineInProcessReload:reload
RichardBradley commented 1 year ago

Thanks for the very clear explanation.

This sounds like a serious regression for serverless-offline: I would think that local development, including hot reloading of changed files, is the main use-case for serverless-offline, and it's currently not working properly (by default) with the new worker thread approach. That seems like a much worse problem than memory leaks.

I'll watch here and #1660 for updates.

thetrevdev commented 10 months ago

If you are using webpack you can skip the watcher plugin and directly use this plugin to wire up the two hooks: webpack:compile:watch:compile -> offline:functionsUpdated

https://www.npmjs.com/package/serverless-webpack-offline-reload

serverless-esbuild has this built in as well.