Open joker-777 opened 2 years ago
Hello @joker-777, thanks for trying Propshaft and raising the issue.
I'm not familiar with LiveReload, but on Propshaft's side, it adds an extra `before_action` to the application controller of your app that checks for changes and clears the cache when you make a new request. I think it's possible that with a large enough number of assets, a second reload will hit while the cache from the first reload is being refreshed, and this might cause the race condition you are seeing. I'll see what I can do.
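To make the suspected race concrete, here is a minimal, self-contained sketch (illustrative only, not Propshaft's code): one thread clears and reseeds a shared cache without a lock, while a concurrent reader can catch the empty window, which is what a 410 for an asset that actually exists would look like.

```ruby
# Minimal illustration of the suspected race (not Propshaft source code):
# a "reload" thread clears and reseeds a shared cache without a lock, while
# a "request" thread can observe the window where the cache is empty and
# would treat the asset as missing.
cache = { "loader.js" => "loader-abc.digested.js" }

reloader = Thread.new do
  200.times do
    cache = nil                                         # clear...
    sleep 0.001                                         # widen the race window
    cache = { "loader.js" => "loader-abc.digested.js" } # ...then reseed
  end
end

misses = 0
reader = Thread.new do
  200_000.times { misses += 1 if cache.nil? } # a request looking up an asset
end

[reloader, reader].each(&:join)
puts "observed #{misses} empty-cache lookups" # usually non-zero, timing-dependent
```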
As for production, `assets:clean` will keep at least three versions of each asset, plus all that were created in the last hour. Unless you are deploying changes that touch all assets multiple times per hour, it should not be a problem. Also: production uses a manifest file, so it does not read files and calculate their digests dynamically.
Hi @brenogazzola! Thank you! I work with @joker-777 and we investigated this issue together.
We would really appreciate it if you could look into it, and we're also happy to try out any ideas. As we mentioned, reproducing it is quite tricky.
As for production, yes, the manifest keeps track of versions of assets and clears them on `assets:clean`. However, assets that have `.digested` in the filename aren't tracked or cleared by Propshaft. We currently run a manual script to try to clean those up, but it's a bit brittle, because there's no easy way to know what the last 3 versions were, and we have to rely on created or updated timestamps of the files on the filesystem. This is a crude tool (e.g. whatever threshold we set for "old", if we don't deploy for longer than the threshold, we might risk pruning digested files that are still in use).
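Roughly, such a script amounts to the following (a sketch; the glob pattern and the one-week threshold are illustrative assumptions, not our exact code):

```ruby
# Sketch of a crude mtime-based cleanup for pre-digested assets (paths and
# threshold are illustrative). The foot-gun: if nothing is deployed within
# THRESHOLD, files that are still referenced get pruned too.
require "fileutils"

THRESHOLD = 7 * 24 * 60 * 60 # "old" means older than a week (a guess)

Dir.glob("public/assets/**/*.digested.*").each do |file|
  FileUtils.rm(file) if Time.now - File.mtime(file) > THRESHOLD
end
```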
> we're also happy to try out any ideas
My two working ideas are using a semaphore while the cache is reloading (which is going to have an impact on performance), or figuring out how to cache-bust only the modified files (which makes the code more complex). I need to find time to test each of them.
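For the second idea, a very rough sketch of what "cache-bust only the modified files" could mean (hypothetical names and structure, not Propshaft internals):

```ruby
# Hypothetical sketch: instead of clearing the whole cache, re-stat only the
# files already in it and refresh entries whose mtime changed.
require "digest"

def refresh_modified!(cached_assets_by_path)
  cached_assets_by_path.each do |path, asset|
    mtime = File.mtime(path)
    next if asset[:mtime] == mtime
    asset[:mtime]  = mtime
    asset[:digest] = Digest::SHA1.file(path).hexdigest # re-digest only this file
  end
end
```

New or deleted files would still need a full rescan, which is part of why this route makes the code more complex.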
> However, assets that have `.digested` in the filename aren't tracked or cleared by Propshaft.
You mean from `builds` or `public/assets`? The `builds` folder needs to be committed, but its contents should be in `.gitignore`. As for `public/assets`, that's definitely a bug. There was a recent change related to `.digested` and we must have missed it. I'll check.
> This is a crude tool

It is, however, how `assets:clean` works. It will keep at least 3 versions of every asset, but as many as there were in the past hour, all based on the `mtime`.
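As a sketch, that retention rule amounts to something like this (illustrative, not the actual `assets:clean` implementation):

```ruby
# For one logical asset, pick which digested versions to delete: keep the
# 3 newest, plus anything modified within the last hour (based on mtime).
def versions_to_remove(files, count: 3, age: 3600)
  sorted = files.sort_by { |file| -File.mtime(file).to_i } # newest first
  sorted.drop(count).select { |file| Time.now - File.mtime(file) > age }
end
```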
> My two working ideas are using a semaphore while the cache is reloading (which is going to have an impact on performance), or figuring out how to cache-bust only the modified files (which makes the code more complex). I need to find time to test each of them.
I'm not sure about the more complex cache-busting, but I'd give the semaphore a try, at least to see if it solves the issue we're seeing. I'll keep you posted if I find anything. That's a great idea! Thank you @brenogazzola.
About the `.digested` files, sorry, I explained the problem all wrong. Let me try to explain again :)
With "normal" assets, let's say `init.js`, Propshaft automatically creates a digest and stores the file with the digest. So in production, you might end up with `init-aaaaa....js` and `init-bbbbb....js`, etc. The cleanup process will then map `init.js` to the two digested files, which allows Propshaft to clean up older versions from the filesystem.
With `.digested` assets, Propshaft receives an already-digested file and doesn't apply another digest to it, so we might produce, e.g., `loader-abcdef.digested.js` and then, during the next deploy, `loader-fedcba.digested.js`. Propshaft does not extract the digest and the logical path out of those files:
```
pry(main)> extract_path_and_digest("/app/loader-49b7fe393bf868bc53e952ba59a1120b037488edf92d60c70a9148ede3cdb326.digested.js")
=> ["/app/loader-49b7fe393bf868bc53e952ba59a1120b037488edf92d60c70a9148ede3cdb326.digested.js", nil]

pry(main)> extract_path_and_digest("/app/init-7324c15680c90f1782e2e2df9038c0c39b271a5d047745c84972e6f8635a4ddf.js")
=> ["/app/init.js", "7324c15680c90f1782e2e2df9038c0c39b271a5d047745c84972e6f8635a4ddf"]
```
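For context, extraction logic consistent with that output might look roughly like this (a sketch, not necessarily Propshaft's actual source): a digest is only recognized when it is not followed by `.digested`, so pre-digested files yield a nil digest and keep their full path as the logical path.

```ruby
# Sketch of digest extraction consistent with the pry output above. The
# negative lookahead skips hashes that are followed by ".digested", so
# pre-digested files come back with a nil digest.
def extract_path_and_digest(digested_path)
  digest = digested_path[/-([0-9a-f]{7,128})\.(?!digested)/, 1]
  path   = digest ? digested_path.sub("-#{digest}", "") : digested_path
  [path, digest]
end
```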
This means that each `.digested` file is a snowflake, and therefore never gets cleaned up, because there's only one *current* version of it. Our own crude tool just looks for older `.digested` files and deletes them, but that's a foot-gun in many situations. I don't have a solution for this; it's just something we spotted as a limitation of the cleanup process when using pre-digested assets. It also causes the cached assets to balloon over time in development.
Hope this makes sense? I'm not that familiar with propshaft yet, so I hope I didn't mess this up again :)
Yep! Adding a mutex around the cache cleanup seems to solve the issue (fingers crossed). I have very limited experience with concurrency in general, and basically zero with Ruby concurrency, but I simply wrapped the cleanup code in a mutex and it seems to do the trick:
```ruby
class Propshaft::LoadPath
  # ...

  def initialize(paths = [], version: nil)
    @paths   = dedup(paths)
    @version = version
    @mutex   = Mutex.new # one lock per load path; ||= is redundant in initialize
  end

  # ...

  def clear_cache
    # Serialize cache rebuilds so a concurrent request can't observe the
    # cache in its cleared, half-seeded state.
    @mutex.synchronize do
      @cached_assets_by_path = nil
      assets_by_path # seeds the cache
    end
  end
end
```
> This is a crude tool (e.g. whatever threshold we set for "old", if we don't deploy for longer than the threshold, we might risk pruning digested files that are still in use).
@gingerlime wouldn't it be safer to parse the `.manifest.json` file, keep the files that are in there, and delete all the rest?
@tsrivishnu That is an option, but then it would only keep one version. Shouldn't it keep at least the versions from the previous deployment?
@joker-777 I think it's doable with these two options:

1. If you use Capistrano, it keeps (or can keep) backups of the asset manifest in `assets_manifest_backup` in each release folder. I suppose we could go through all the kept releases and use their backed-up manifests. I see the following every time I deploy:

```
00:28 deploy:assets:backup_manifest
      01 mkdir -p /home/xxx/apps/xxx/releases/20230530111226/assets_manifest_backup
    ✔ 01 xx@app.stage.xxxx.xx 0.084s
      02 cp /home/xxx/apps/xxx/releases/20230530111226/public/assets/.manifest.json /home/xxx/apps/xxx/releases/20230530111226/assets_manifest_backup
    ✔ 02 xx@app.stage.xxxx.xx 0.071s
```

2. Use the versioned manifest files that are kept in `public/assets`, for example:

```
manifest-dcca4ae2c9da08d55acf8dfdc7524cc89f328b72.json
manifest-f1378d3c96438a5dee65f98c16c9233536e3f602.json
```
I think it's worth noting that in option 2, the manifest JSON that is versioned and kept is different from `.manifest.json`. I suppose those are generated by Webpack, while `.manifest.json` is generated by Propshaft. I haven't fully tested the options, but it looks like option 2 can be used for pre-digested assets.
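Combining these ideas, a pruning script could keep the union of everything referenced by the current and backed-up manifests and delete the rest. A hedged sketch (the paths, glob patterns, and the assumed `{ logical_path => digested_path }` manifest layout are illustrative, not tested code):

```ruby
# Sketch: prune public/assets down to the files referenced by the current
# .manifest.json plus any backed-up manifests from kept releases.
require "json"
require "set"
require "fileutils"

manifests = Dir.glob("releases/*/assets_manifest_backup/.manifest.json") +
            ["public/assets/.manifest.json"]

# Every digested filename referenced by any kept manifest survives.
keep = manifests.flat_map { |m| JSON.parse(File.read(m)).values }.to_set

Dir.glob("public/assets/**/*").each do |file|
  next if File.directory?(file)
  next if File.basename(file).start_with?("manifest") # keep manifest backups
  FileUtils.rm(file) unless keep.include?(file.delete_prefix("public/assets/"))
end
```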
We don't use Capistrano. Versioning the manifest file could be a solution, though.
@gingerlime Could you try https://github.com/rails/propshaft/issues/110 and ensure that still works for you?
Re: production accumulation of digest chunks, is this an issue because you're using Heroku or something? With a containerized deployment, like Kamal, each container should be starting with a fresh build.
Thanks for following up on this. I’m no longer working on this codebase but @joker-777 should be able to comment on this issue.
@dhh Thanks, I will try to test this as soon as possible.
@dhh The link in your comment doesn't work, but I guess you meant version 0.9.0. We haven't seen this error for a long time. It may be because we don't use LiveReload anymore. But we will start using 0.9.0 from now on and let you know if we find any problems.
Did you also find a solution for the `*.digested.js` files?
I did mean 0.9, yes. If you're talking about cleanup of digested files, I don't know that there is a straight shot there. It's also not a problem that affects modern containerized deployments, since they'll have a clean build every time.
Hi, first of all, thanks a lot for developing this new solution. We are using it in production and are quite happy. One issue we bump into quite frequently, though, is in development.

We use webpack and depend on dynamic chunk loading. That means webpack needs to compile these chunks already with a hash and a `.digested` suffix. We also use LiveReload from webpack to refresh the page automatically after each compilation.

What we observed is that when we apply multiple changes in succession, a dynamic chunk request might return a 410. We debugged the code and could see that the asset wasn't in the `@cached_assets_by_path`. We also saw that Propshaft clears the cache when any of the files in `app/assets/builds` changes. We suspect that there is some kind of race condition between webpack and Propshaft: clearing the cache, compiling multiple times, and refreshing the browser at the same time.

We also noted that the digested assets created by webpack accumulate over time, which could slow down the creation of `@cached_assets_by_path`. This accumulation happens in production as well and doesn't get cleared by `assets:clean`.

We tried to find a reproducible example, but since it is likely a race condition, it is very hard. We would appreciate any thoughts and pointers.