broccolijs / broccoli-persistent-filter

MIT License
12 stars 33 forks source link

"unexpected end of file" can persist from corrupted build and there is no discoverable way to force rebuild #208

Open nicolekc opened 3 years ago

nicolekc commented 3 years ago

Hi all, I have been using this (unknowingly until yesterday 😉) through ember build.

Sometimes, the build (or ember serve) appears to hang and I will CTRL+C the process to get past it without waiting.

When that is done, sometimes a broccoli-persistent-filter cache write can be cut off in the middle of writing and then produces a corrupted result, and get an error like this:

Build failed.
Build Error (TemplateCompiler) in foo/bar/thing.js

unexpected end of file

The problem is that the key for this corrupted result is the full content of the original file and the file relative path:

  cacheKeyProcessString(string: string, relativePath: string) {
    return md5Hex(string + 0x00 + relativePath);
  }

So, if you make a small change to the file that reports as an error, then you can recompile. But if you change the file back (e.g. it's not a file you want to commit) then it retrieves the error result again!

I discovered finally by reading various issues in this module and stefanpenner/async-disk-cache (e.g. 1, 2) that is because the contents are not stored in /dist with the final build, they use os.tmpdir() plus my username, e.g.

/var/folders/4b/bpt40nwn4cqbtmkn34r0dy_40000gn/T/nicole

I think in https://github.com/stefanpenner/async-disk-cache/issues/35 the issue that was considered was whether this dir needed to be able to be deleted in case of growing unbounded, so there was some documentation added to https://github.com/broccolijs/broccoli-persistent-filter#warning:

On OSX, files that aren't accessed in three days are deleted from /tmp. On systems using systemd, systemd-tmpfiles should already be present and regularly clean up the /tmp directory. On Debian-like systems, you can use tmpreaper. On RedHat-like systems, you can use tmpwatch.

But my issue is different. In my case, I don't have a problem with out of date files (I'm on OS X), but rather with active files.

It seems that there is no mechanism to force a refresh unless you know about this step of the broccoli build process. This was very hard for me to discover and I only did by finally searching my node_modules for words that appeared in the build output (we are a large team with an enormous ember app, many thousands of files and over a thousand modules in node_modules).

I have fixed this by simply making the build relative to my project's root director with BROCCOLI_PERSISTENT_FILTER_CACHE_ROOT, but I'm raising this issue because I believe there are a couple general issues that should be fixed...

Things that would have fixed this issue for me:

  1. this module should handle the case of corruption by adding file mtime to the hash key (automatic fix — I observe the build file which is corrupted in the error output, I touch the file, and voila)
  2. and since there are some cases like this where the only recovery is a clean build, it's strange to me that the cache directory name is "if-you-need-to-delete-this-open-an-issue-async-disk-cache". Couldn't there be other cases where a fallback may need to be to delete the cache dir?
  3. and finally, on the point of #2, I think the cache dir should be made discoverable, at least by some build output. (see note*). This would help naive users like me (who don't even know ember uses broccoli to come find this module's readme) to discover the relevant information and try deleting the dir.

(I did finally notice, after all this discovery, that the error dump I was looking at, e.g. /var/folders/4b/bpt40nwn4cqbtmkn34r0dy_40000gn/T/error.dump was close* to the cache, but one level up, not including username. I had no way to guess that it was my username until I finally read the correct code, and since the file names are md5 hashes, and the contents are compressed, there is no way to use shell find or grep to identify the troublemaking file.)