Closed airhorns closed 3 years ago
This is odd... the successfully opened file is being queried as a directory:
4801 getdents64(6, 0xc0005da000, 8192) = -1 ENOTDIR (Not a directory)
Might you be able to strace all previous file descriptor names using 6 (or whatever the failing fd is in the new run)? If I had to guess, in incremental mode some goroutine thinks an old directory fd is still live.
Thanks for all of the detail in this issue description. Unfortunately I'm not quite sure what the problem is here (although I agree ENOTDIR
sounds potentially relevant). I'm going to add a verbose logging mode in the next release that might provide more information. When it's released, you'll be able to enable it with --log-level=debug
.
Edit: It's been released in version 0.11.3. Please try it out and let me know if it provides any additional information for this problem.
Super useful, thanks Evan!
Here's a couple examples of the resolution phase failing (for two different runs):
> debug: Resolving import "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts" in directory "/app" of type "entry-point"
note: The import "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts" is being treated as an absolute paths
note: Read 30 entries for directory "/app"
note: Attempting to load "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts" as a file
note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts"
note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts.tsx"
note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts.ts"
note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts.jsx"
note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts.js"
note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts.css"
note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts.json"
note: Attempting to load "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts" as a directory
note: Read 7 entries for directory "/app/packages/api/spec/services"
note: Failed to find file "/app/packages/api/spec/services/docs-search/index.tsx"
note: Failed to find file "/app/packages/api/spec/services/docs-search/index.ts"
note: Failed to find file "/app/packages/api/spec/services/docs-search/index.jsx"
note: Failed to find file "/app/packages/api/spec/services/docs-search/index.js"
note: Failed to find file "/app/packages/api/spec/services/docs-search/index.css"
note: Failed to find file "/app/packages/api/spec/services/docs-search/index.json"
note: Read 0 entries for directory "/app/packages/api/spec/services/docs-search"
note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts/index.tsx"
note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts/index.ts"
note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts/index.jsx"
note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts/index.js"
note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts/index.css"
note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts/index.json"
note: Read 0 entries for directory "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts"
> error: Could not resolve "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts"
> debug: Resolving import "/app/packages/api/spec/routes/render/utils.ts" in directory "/app" of type "entry-point"
note: The import "/app/packages/api/spec/routes/render/utils.ts" is being treated as an absolute paths
note: Read 30 entries for directory "/app"
note: Attempting to load "/app/packages/api/spec/routes/render/utils.ts" as a file
note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts"
note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts.tsx"
note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts.ts"
note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts.jsx"
note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts.js"
note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts.css"
note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts.json"
note: Attempting to load "/app/packages/api/spec/routes/render/utils.ts" as a directory
note: Read 1 entry for directory "/app/packages/api/spec/routes/render"
note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts/index.tsx"
note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts/index.ts"
note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts/index.jsx"
note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts/index.js"
note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts/index.css"
note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts/index.json"
note: Read 0 entries for directory "/app/packages/api/spec/routes/render/utils.ts"
> error: Could not resolve "/app/packages/api/spec/routes/render/utils.ts"
Here's the actual file:
root@ubuntu:~# ls -lA "/app/packages/api/spec/routes/render/utils.ts"
-rw-r--r-- 1 root root 2750 Apr 2 14:42 /app/packages/api/spec/routes/render/utils.ts
Very suspicious to me is the note: Read 1 entry for directory "/app/packages/api/spec/routes/render"
from that log -- there are 6 files in that directory.
root@ubuntu:~# ls -lA /app/packages/api/spec/routes/render/
total 32
-rw-rw-r-- 1 root root 7739 Apr 2 14:42 app-access.spec.ts
-rw-rw-r-- 1 root root 1255 Apr 2 14:42 client-side-app.spec.ts
-rw-rw-r-- 1 root root 630 Apr 2 14:42 docs-redirect.spec.ts
drwxrwxr-x 2 root root 4096 Apr 2 14:42 graphql
-rw-rw-r-- 1 root root 5705 Apr 2 14:42 login-logout.spec.ts
-rw-rw-r-- 1 root root 2750 Apr 2 14:42 utils.ts
For comparison, here's an example of a successful resolution from one of the same runs:
> debug: Resolving import "/app/packages/api/src/routes/edit/graphql/connections.ts" in directory "/app" of type "entry-point"
note: The import "/app/packages/api/src/routes/edit/graphql/connections.ts" is being treated as an absolute paths
note: Read 30 entries for directory "/app"
note: Attempting to load "/app/packages/api/src/routes/edit/graphql/connections.ts" as a file
note: Found file "/app/packages/api/src/routes/edit/graphql/connections.ts"
note: Read 9 entries for directory "/app/packages/api/src/routes/edit/graphql"
note: This import is under the effect of "/app/tsconfig.json"
note: Primary path is "/app/packages/api/src/routes/edit/graphql/connections.ts" in namespace "file"
It would appear that the file isn't being found at that lowest level of resolution in loadAsFile
.
Might you be able to strace all previous file descriptor names using 6 (or whatever the failing fd is in the new run)? If I had to guess, in incremental mode some goroutine thinks an old directory fd is still live.
It is always fd 6 and 6 is used for almost every read I see in the process doing. I think (but am not certain) that the esbuild code that reads files like tsconfig.json
is different than the code that reads files to resolve them, and that all resolutions including successful ones do that getdents call. I think that is
https://github.com/evanw/esbuild/blob/fb8681e23af9d41f77349a92b72a9cf8bc8bd432/internal/resolver/resolver.go#L1231 or https://github.com/evanw/esbuild/blob/c62d33f6df47f07f408dc115ef1fdfc998cf03f3/internal/bundler/bundler.go#L1301-L1302 but I am not really sure.
Here's the whole strace if you are curious.
Also, I am equally suspicious of the new environment that I am running in as I am of esbuild -- it is the thing that is different between the case that works fine for me and the case that doesn't. It may be some other system level limit or something I haven't configured properly so apologies if this is just my fault, hopefully we can teach esbuild to emit an error message that makes it clear who is at fault though!
Very suspicious to me is the note: Read 1 entry for directory "/app/packages/api/spec/routes/render" from that log -- there are 6 files in that directory.
The strace log mirrors that observation, albeit with a different directory from a different run.
Here's the log fragment for the directory query of /app/packages/api/src/services/app-graphql-api/meta
with some extraneous system calls removed:
4803 openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta", O_RDONLY|O_CLOEXEC) = 6
4800 tgkill(4799, 4803, SIGURG <unfinished ...>
4803 getdents64(6 <unfinished ...>
4800 <... tgkill resumed> ) = 0
4803 <... getdents64 resumed> , /* 1 entries */, 8192) = 40
4803 --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=4799, si_uid=0} ---
4803 rt_sigreturn({mask=[]}) = 40
4803 getdents64(6 <unfinished ...>
4803 <... getdents64 resumed> , /* 0 entries */, 8192) = 0
4803 close(6) = 0
Notice getdents64
had one directory entry reported before it was interrupted with SIGURG
, which presumably is from the Go runtime or third party library because I don't see it in the esbuild code base. After that signal, 0 entries
were reported indicating no more files in the directory. But it is notable that this same directory later has a number of files incorrectly queried as directories:
4801 openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/GadgetMetaSchema.ts", O_RDONLY|O_CLOEXEC) = 6
4801 getdents64(6, 0xc0005da000, 8192) = -1 ENOTDIR (Not a directory)
4801 openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/GadgetModelType.ts", O_RDONLY|O_CLOEXEC) = 6
4801 getdents64(6, 0xc0005e4000, 8192) = -1 ENOTDIR (Not a directory)
4807 openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/GadgetPageType.ts", O_RDONLY|O_CLOEXEC) = 6
4807 getdents64(6, 0xc00035a000, 8192) = -1 ENOTDIR (Not a directory)
4801 openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/GadgetRoleType.ts", O_RDONLY|O_CLOEXEC <unfinished ...>
4801 <... openat resumed> ) = 6
4801 fcntl(6, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 0
4801 getdents64(6 <unfinished ...>
4801 <... getdents64 resumed> , 0xc0005e8000, 8192) = -1 ENOTDIR (Not a directory)
4801 openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/HydrationPlan.ts", O_RDONLY|O_CLOEXEC <unfinished ...>
4801 <... openat resumed> ) = 6
4801 getdents64(6, 0xc0005ea000, 8192) = -1 ENOTDIR (Not a directory)
4801 openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/actionInputSDL.ts", O_RDONLY|O_CLOEXEC <unfinished ...>
4801 <... openat resumed> ) = 6
4801 getdents64(6, 0xc0005ec000, 8192) = -1 ENOTDIR (Not a directory)
4801 openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/GadgetAuthenticationMethodType.ts", O_RDONLY|O_CLOEXEC <unfinished ...>
4801 <... openat resumed> ) = 6
4801 getdents64(6 <unfinished ...>
4801 <... getdents64 resumed> , 0xc0005f0000, 8192) = -1 ENOTDIR (Not a directory)
4801 openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/GadgetFieldType.ts", O_RDONLY|O_CLOEXEC <unfinished ...>
4801 <... openat resumed> ) = 6
4801 getdents64(6 <unfinished ...>
4801 <... getdents64 resumed> , 0xc0005f6000, 8192) = -1 ENOTDIR (Not a directory)
4801 openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/GadgetGlobalActionType.ts", O_RDONLY|O_CLOEXEC) = 6
4801 getdents64(6 <unfinished ...>
4801 <... getdents64 resumed> , 0xc0005f8000, 8192) = -1 ENOTDIR (Not a directory)
So esbuild's directory resolving logic appears to be reasonable, but I suspect that the system was interrupted while interrogating the files in the directory and it may have given an incorrect result. So this might be a Go runtime issue related to interrupting getdents64
on Linux with SIGURG. Perhaps esbuild could defensively introduce code to invalidate the resolve cache for a (probable) file's parent directory in the presence of ENOTDIR.
Right. I think the reason those files are later queried is because they are passed in as entrypoints to the build explicitly -- the tool invoking esbuild does a glob for all the js/ts files on disk and passes them all in as entrypoints. So, that glob which uses fast-glob underneath discovered the files just fine.
Can you confirm that all the files in question existed in those directories prior to running esbuild?
I think for the glob to have found them they had to have existed, and to my knowledge there's nothing that would delete them while esbuild was running? Subsequent runs in the exact same place succeed as well so I don't think they are coming and going but hey -- this is really weird, who knows.
Here's a few more full straces of failed executions:
http://paste.ubuntu.com/p/JxQ9F4MrgS/ failed with
error: Could not resolve "/app/packages/api/src/services/edit-graphql-api/EditGraphQLContext.ts"
error: Could not resolve "/app/packages/api/src/services/edit-graphql-api/formatError.ts"
error: Could not resolve "/app/packages/api/src/__generated__/edit-resolvers.ts"
error: Could not resolve "/app/packages/api/src/routes/test-support/scope.ts"
http://paste.ubuntu.com/p/QY8PWyVwR7/ failed with
error: Could not resolve "/app/packages/api/spec/fixtures/createFixture.ts"
error: Could not resolve "/app/packages/api/spec/fixtures/fixtureLoader.ts"
error: Could not resolve "/app/packages/api/src/services/app-database/DatabaseOperation.ts"
error: Could not resolve "/app/packages/api/src/services/app-database/FieldManipulator.ts"
error: Could not resolve "/app/packages/api/src/services/app-database/InternalValidateRecord.ts"
http://paste.ubuntu.com/p/8ksHkVrD9N/ failed with
error: Could not resolve "/app/packages/api/src/routes/test-support/scope.ts"
error: Could not resolve "/app/packages/api/src/__generated__/edit-resolvers.ts"
error: Could not resolve "/app/packages/api/src/workers/AppDependenciesSetup.ts"
error: Could not resolve "/app/packages/api/src/workers/CreateUniqueIndexConverges.ts"
error: Could not resolve "/app/packages/api/src/workers/GadgetQueue.ts"
In the first run there this is the strace snippet for listing of the directory the missing files are in
4443 openat(AT_FDCWD, "/app/packages/api/src/services/edit-graphql-api", O_RDONLY|O_CLOEXEC <unfinished ...>
4443 <... openat resumed> ) = 6
4443 epoll_ctl(3, EPOLL_CTL_ADD, 6, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=3797217304, u64=140706925834264}} <unfinished ...>
4443 <... epoll_ctl resumed> ) = -1 EPERM (Operation not permitted)
4443 epoll_ctl(3, EPOLL_CTL_DEL, 6, 0xc0002ae524 <unfinished ...>
4443 <... epoll_ctl resumed> ) = -1 EPERM (Operation not permitted)
4443 getdents64(6 <unfinished ...>
4443 <... getdents64 resumed> , /* 1 entries */, 8192) = 24
4443 --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=4439, si_uid=0} ---
4443 rt_sigreturn({mask=[]} <unfinished ...>
4443 <... rt_sigreturn resumed> ) = 24
4443 getdents64(6 <unfinished ...>
4443 <... getdents64 resumed> , /* 0 entries */, 8192) = 0
4440 nanosleep({tv_sec=0, tv_nsec=20000}, <unfinished ...>
4443 close(6) = 0
and for the second run
4537 openat(AT_FDCWD, "/app/packages/api/spec/fixtures", O_RDONLY|O_CLOEXEC) = 6
4537 epoll_ctl(3, EPOLL_CTL_ADD, 6, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=350822616, u64=140218148135128}} <unfinished ...>
4537 <... epoll_ctl resumed> ) = -1 EPERM (Operation not permitted)
4537 epoll_ctl(3, EPOLL_CTL_DEL, 6, 0xc00022e524 <unfinished ...>
4537 <... epoll_ctl resumed> ) = -1 EPERM (Operation not permitted)
4537 getdents64(6 <unfinished ...>
4537 <... getdents64 resumed> , /* 1 entries */, 8192) = 32
4537 --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=4534, si_uid=0} ---
4537 rt_sigreturn({mask=[]} <unfinished ...>
4537 <... rt_sigreturn resumed> ) = 32
4537 getdents64(6, /* 0 entries */, 8192) = 0
4535 <... nanosleep resumed> NULL) = 0
4535 nanosleep({tv_sec=0, tv_nsec=20000}, <unfinished ...>
Theres that SIGURG in both examples! If I look at some of the surrounding getdents64 calls I don't see that signal being sent.
It does seem like a Go runtime issue, barring some unusual storage device or background rsync-like process running.
Possibly related: https://github.com/golang/go/issues/39237
Do you still see the file errors with the workaround mentioned in https://github.com/golang/go/issues/39237#issuecomment-665739096 ?
export GODEBUG=asyncpreemptoff=1
🎉 That resolves the issue for me! Nice find @kzc , thank you!
I talked a bit with the folks from this other CI provider (LayerCI) and they're actually using a FUSE mounted filesystem under the hood to serve the files to esbuild using go-fuse
, which I think may suffer from the same issue itself. To me this helps confirm I am hitting the same issue linked to up there, just with a different FS than CIFS.
In any case, I think esbuild is not the culprit here so I will close this issue.
For anyone else coming along seeing strange resolve errors for files that seem like they do exist on disk, especially if running on top of a userspace filesystem via FUSE, try running esbuild with the GODEBUG=asyncpreemptoff=1
environment variable value set and see if your issues go away.
Even though it's a Linux kernel bug, I think this issue should be be reopened and warrants an esbuild workaround.
It's a difficult to diagnose problem that only occurs on Linux with Go applications on certain filesystems since the implementation of the SIGURG
based non-cooperative go routine preemption proposal. Applications written in other compiled or jitted languages including C, C++, Java, Rust, and JavaScript do not have this problem with getdents64
under normal circumstances with the file system in question. And as mentioned above, Go programs will not exhibit the getdents64
problem with the GODEBUG=asyncpreemptoff=1
environment setting.
The June 2020 linux kernel bug report hasn't generated any activity as of this writing. And the RedHat bug report seems to have been marked as CLOSED ERRATA.
I think esbuild could have a directory-specific resolve cache invalidation strategy in the event of ENOTDIR
in some situations as a defensive measure to workaround this getdents64
bug in the linux kernel.
One potential solution could be to force asyncpreemptoff
to 1 on Linux. It doesn't look like there's a Go API for this though, so that might have to be logged as a feature request for the Go team. Either that or the Go team might want to add a workaround for this on their end.
Regardless, it would be good to create another bug on the Go issue tracker. The issue https://github.com/golang/go/issues/39237 was closed even though the issue isn't fixed, and the last post says to create a new issue. I don't think it'd be appropriate for me to create one as I have no way of reproducing the problem myself.
@evanw I think there's enough information at hand for an esbuild workaround on Linux.
The Readdirnames
err
does not appear to be used?
The erroneously truncated entries
due to SIGURG
interrupting getdents64
on Linux could probably be detected here - and Readdirnames
could be retried.
I just ran into this issue using a monorepo with yarn workspaces. It causes a symlinked local dependency to be unresolvable.
When running a working esbuild config in a new environment, I get frequent but inconsistent
Could not resolve <file>
errors despite the files in question existing just fine.I'm trying to get an esbuild configuration that works locally on my OS X machine and in a Github Actions Ubuntu runner onto another Linux based CI system. The exact same config works fine always in the other environments, but only sometimes works in the new environment. I am building a big TypeScript project for node by passing ~200 or so
.ts
entrypoint files to the JSbuild
API.The files that esbuild fails to resolve change each time, but each time it does fail, it's all of the files in a particular folder that fail to resolve:
I don't have any resolve plugins set up and I am running on
v0.11.1
.Debugging
I tried adding a resolve plugin that runs an
fs.promises.access
on each file and then returns undefined to prove that the same process invoking esbuild can access the file fine, and it indeed logged successful accesses. Because of this and because the build works sometimes I think the files really are there.Curiously, strace-ing esbuild makes it fail to resolve some files almost every time I invoke it with strce. I don't see anything amiss in the strace, but here's an example trace for a file that fails to resolve:
To me it seems like opening the file succeeds just fine, and then esbuild writes the
error: Could ...
string to stderr.I was running
strace -f -e '!futex,read' yarn <my-esbuild-task>
which ends up running esbuild with these options:see https://github.com/gadget-inc/esbuild-dev/blob/main/src/Compiler.ts#L98-L106 for the source
For contrast, here's what the strace looks like for esbuild reading a file successfully early:
I realize this isn't a super reproducible issue, but I am hoping that these are enough breadcrumbs that we could figure out what next steps to take to unearth more information. Also notable is that this seems similar to #348 , but
ulimit -n
is 65536 in this environment, and I am running as root.Any other information I could gather to help figure out what might be causing this heisenbug?