Closed eharris369 closed 4 years ago
I tried deleting all of my projects and creating a new Open Liberty project. It flipped between Building: Creating image
and Build Failed: Creating image for ... failed
a few times and then finally ended up stuck in Build Queued state. There are no other projects. Here is the pfe log file just for this:
logs.txt
Deleted all projects again and tried a couple of other project types and they appear to be ok (python, spring, go).
Spoke too soon - the Spring project got stuck in build queued state even though it got all the way to running state first:
@eharris369 What are the specific reproduction steps for this? Does it happen upon a new PFE build with existing projects or does it happen on creation of any new project or happen on a update of an existing project which is up?
/assign /pipeline in progress
I am not 100% sure how I got there. I think I restarted Eclipse and already had a 2 projects there (mpA1 and node1) and then I tried to add an Open Liberty project and it got stuck in build queued state.
Is this is only happening on Eclipse? And when you restarted eclipse - was PFE up and were those projects already running?
If you could replicate the issue again, could you please upload the logs with trace mode enabled.
I can turn on trace mode and then if it happens again upload the logs.
Just reproduced it - wasn't that hard. Created a node project then an appsody microprofile project and then an open liberty project one after the other. Did not need to restart eclipse. Here is the log with trace on: logs.txt
What does the build status on those project say on the UI?
It looks like this:
I just tried the same thing on MAC and it did not have the problem so maybe Windows only?
I have noticed on the logs the running build queue looks like:
Running Builds queue: ["938ba370-487f-11ea-9c19-6b6b88d11df1","938ba370-487f-11ea-9c19-6b6b88d11df1","938ba370-487f-11ea-9c19-6b6b88d11df1"]
That is very weird to me. Something is kicking off 3 builds together and blocking the queue. I am not sure if portal does something repeatedly that calls the create function multiple time in windows.
Yes I can't reproduce on my machine (Mac) either
This part of the log is even more interesting:
[34m[06/02/20 01:28:32 mpA1] [TRACE][39m [Container] [INFO] Downloaded from central: https://repo.maven.apache.org/maven2/io/openliberty/features/microProfile-3.0/19.0.0.8/microProfile-3.0-19.0.0.8.esa (18 kB at 50 kB/s)
[Container] [INFO] Downloading from central: https://repo.maven.apache.org/maven2/com/ibm/websphere/appserver/spi/com.ibm.websphere.appserver.spi.containerServices/3.0.31/com.ibm.websphere.appserver.spi.containerServices-3.0.31.jar
[34m[06/02/20 01:28:32 mpA1] [TRACE][39m [Container] [INFO] Downloaded from central: https://repo.maven.apache.org/maven2/io/openliberty/features/com.ibm.websphere.appserver.appLifecycle-1.0/19.0.0.8/com.ibm.websphere.appserver.appLifecycle-1.0-19.0.0.8.esa (9.3 kB at 25 kB/s)
[Container] [INFO] Downloading from central: https://repo.maven.apache.org/maven2/io/openliberty/features/com.ibm.websphere.appserver.javax.cdi-1.2/19.0.0.8/com.ibm.websphere.appserver.javax.cdi-1.2-19.0.0.8.esa
[34m[06/02/20 01:28:32 mpA1] [TRACE][39m [Container] [INFO] Downloaded from central: https://repo.maven.apache.org/maven2/com/ibm/websphere/appserver/spi/com.ibm.websphere.appserver.spi.containerServices/3.0.31/com.ibm.websphere.appserver.spi.containerServices-3.0.31.jar (59 kB at 134 kB/s)
[34m[06/02/20 01:28:32 mpA1] [TRACE][39m [Container] [INFO] Downloading from central: https://repo.maven.apache.org/maven2/io/openliberty/features/com.ibm.websphere.appserver.javax.el-3.0/19.0.0.8/com.ibm.websphere.appserver.javax.el-3.0-19.0.0.8.esa
[36m[06/02/20 01:28:32 User.js] [DEBUG][39m Project 938ba370-487f-11ea-9c19-6b6b88d11df1 file change list received on 1581017194453. chunk number 1 with total chunk 1 for this timestamp.
[32m[06/02/20 01:28:32 /portal/modules/FileWatcher.js] [INFO][39m 938ba370-487f-11ea-9c19-6b6b88d11df1 update start time: 1580952512798
[32m[06/02/20 01:28:32 /portal/routes/projects/remoteBind.route.js] [INFO][39m Total time to sync project node1 to build container is 1580952512.798 seconds
[36m[06/02/20 01:28:32 /portal/modules/FileWatcher.js] [DEBUG][39m Portal's event listener: projectStatusChanged
[36m[06/02/20 01:28:32 /portal/modules/FileWatcher.js] [DEBUG][39m Portal's event listener: {"projectID":"938ba370-487f-11ea-9c19-6b6b88d11df1","buildStatus":"inProgress"}
[36m[06/02/20 01:28:32 /portal/modules/FileWatcher.js] [DEBUG][39m projectStatusChanged: inProgress (project 938ba370-487f-11ea-9c19-6b6b88d11df1)
[34m[06/02/20 01:28:32 /portal/modules/FileWatcher.js] [TRACE][39m projectStatusChanged: inProgress (project: {
"projectID": "938ba370-487f-11ea-9c19-6b6b88d11df1",
"buildStatus": "inProgress"
})
[36m[06/02/20 01:28:32 Turbine] [DEBUG][39m Found at least one build in the queue
[36m[06/02/20 01:28:32 Turbine] [DEBUG][39m Space available to trigger the next build
[36m[06/02/20 01:28:32 Turbine] [DEBUG][39m Metadata for next build: {"operation":{"operationId":"f0fca9ba5e79c494ac6c80f7a1301a3e","type":"update","projectInfo":{"projectID":"938ba370-487f-11ea-9c19-6b6b88d11df1","projectType":"nodejs","location":"/codewind-workspace/node1","autoBuildEnabled":true,"startMode":"run","appPorts":["3000"],"language":"nodejs","debugPort":"9229","ignoredPaths":["*/.idea/*","*.iml","/.project","/load-test*","*/*.swp","*/*.swx","*/.gitignore","*/node_modules*","*/4913","*/.git/*","*/.DS_Store","*/.dockerignore","*/*~","/.settings","/run-dev","/run-debug","/package-lock.json*","/nodejs_restclient.log","/nodejs_dc.log","/manifest.yml","/idt.js","/cli-config.yml","/README.md","/Jenkinsfile","/.cfignore"],"statusPingTimeout":30,"isHttps":false,"buildRequest":false,"sentProjectInfo":true}},"handler":{"requiredFiles":["/Dockerfile","/package.json"],"supportedType":"nodejs","detectChangeByExtension":false,"defaultIgnoredPath":["/.project","/run-dev","/run-debug","/package-lock.json*","/nodejs_restclient.log","/nodejs_dc.log","/manifest.yml","/idt.js","/cli-config.yml","/README.md","/Jenkinsfile","/.cfignore","/load-test*","*/node_modules*","*/.git/*","*/.DS_Store","*/*.swp","*/*.swx","*/4913","*/.dockerignore","*/.gitignore","*/*~","/.settings","/chart"]}}
[34m[06/02/20 01:28:32 Turbine] [TRACE][39m Key translated. Returning message: Build started
[34m[06/02/20 01:28:32 Turbine] [TRACE][39m Emitting project ranks
[36m[06/02/20 01:28:32 Turbine] [DEBUG][39m Running Builds queue: ["938ba370-487f-11ea-9c19-6b6b88d11df1","938ba370-487f-11ea-9c19-6b6b88d11df1"]
[36m[06/02/20 01:28:32 Turbine] [DEBUG][39m Builds in progress: 2
[36m[06/02/20 01:28:32 /portal/modules/FileWatcher.js] [DEBUG][39m Received filewatcher module response: { statusCode: 202 }
[36m[06/02/20 01:28:32 /portal/modules/FileWatcher.js] [DEBUG][39m Portal's event listener: projectStatusChanged
[36m[06/02/20 01:28:32 /portal/modules/FileWatcher.js] [DEBUG][39m Portal's event listener: {"projectID":"938ba370-487f-11ea-9c19-6b6b88d11df1","buildStatus":"inProgress","detailedBuildStatus":"Build started"}
[36m[06/02/20 01:28:32 /portal/modules/FileWatcher.js] [DEBUG][39m projectStatusChanged: inProgress (project 938ba370-487f-11ea-9c19-6b6b88d11df1)
[34m[06/02/20 01:28:32 /portal/modules/FileWatcher.js] [TRACE][39m projectStatusChanged: inProgress (project: {
"projectID": "938ba370-487f-11ea-9c19-6b6b88d11df1",
"buildStatus": "inProgress",
"detailedBuildStatus": "Build started"
})
[32m[06/02/20 01:28:32 node1] [INFO][39m Updating container for nodejs project /codewind-workspace/node1
[32m[06/02/20 01:28:32 node1] [INFO][39m The log name for 938ba370-487f-11ea-9c19-6b6b88d11df1 is cw-938ba370-487f-11ea-9c19-6b6b88d11df1-73a4443b8a5394c30156971bfa6e197e751cfdb7
[32m[06/02/20 01:28:32 node1] [INFO][39m Container name: cw-node1-938ba370-487f-11ea-9c19-6b6b88d11df1
[32m[06/02/20 01:28:32 node1] [INFO][39m Log name: cw-938ba370-487f-11ea-9c19-6b6b88d11df1-73a4443b8a5394c30156971bfa6e197e751cfdb7
[32m[06/02/20 01:28:32 node1] [INFO][39m forceAction: undefined
[32m[06/02/20 01:28:32 Turbine] [INFO][39m 938ba370-487f-11ea-9c19-6b6b88d11df1: Project 938ba370-487f-11ea-9c19-6b6b88d11df1 file changed
[34m[06/02/20 01:28:32 Turbine] [TRACE][39m The projectInfo has been updated for imagePushRegistry: {"projectID":"938ba370-487f-11ea-9c19-6b6b88d11df1","projectType":"nodejs","location":"/codewind-workspace/node1","autoBuildEnabled":true,"startMode":"run","appPorts":["3000"],"language":"nodejs","debugPort":"9229","ignoredPaths":["*/.idea/*","*.iml","/.project","/load-test*","*/*.swp","*/*.swx","*/.gitignore","*/node_modules*","*/4913","*/.git/*","*/.DS_Store","*/.dockerignore","*/*~","/.settings","/run-dev","/run-debug","/package-lock.json*","/nodejs_restclient.log","/nodejs_dc.log","/manifest.yml","/idt.js","/cli-config.yml","/README.md","/Jenkinsfile","/.cfignore"],"statusPingTimeout":30,"isHttps":false,"buildRequest":false,"sentProjectInfo":true}
[32m[06/02/20 01:28:32 node1] [INFO][39m executeBuildScript with args: {/file-watcher/scripts/nodejs-container.sh /codewind-workspace/node1 /codewind-workspace 938ba370-487f-11ea-9c19-6b6b88d11df1 update cw-node1-938ba370-487f-11ea-9c19-6b6b88d11df1 true cw-938ba370-487f-11ea-9c19-6b6b88d11df1-73a4443b8a5394c30156971bfa6e197e751cfdb7 run 9229 NONE node1-938ba370-487f-11ea-9c19-6b6b88d11df1 undefined}, In K8 environment: undefined
[32m[06/02/20 01:28:32 node1] [INFO][39m Now tracking running process 490 for project 938ba370-487f-11ea-9c19-6b6b88d11df1
[32m[06/02/20 01:28:32 node1] [INFO][39m Detected other file changes, Codewind will build the project
[32m[06/02/20 01:28:32 node1] [INFO][39m File change detected. Project will re-build.
[32m[06/02/20 01:28:32 node1] [INFO][39m Changed Files: [ >.project ]
[34m[06/02/20 01:28:32 node1] [TRACE][39m Setting new changed files in the changedFilesMap...
[34m[06/02/20 01:28:32 node1] [TRACE][39m Received last chunk for timestamp 1581017194453 , checking if still waiting for other chunks for other timestamps...
[32m[06/02/20 01:28:32 node1] [INFO][39m Received all chunks.
[32m[06/02/20 01:28:32 node1] [INFO][39m Auto build is enabled, build proceeding...
[32m[06/02/20 01:28:32 node1] [INFO][39m Build state changed for project: 938ba370-487f-11ea-9c19-6b6b88d11df1 from: success, to: inProgress
[32m[06/02/20 01:28:32 node1] [INFO][39m Emitting event
message: projectStatusChanged
data: {
"projectID": "938ba370-487f-11ea-9c19-6b6b88d11df1",
"buildStatus": "inProgress"
}
So it is doing the dockerbuild of a project and out of nowehere it kicks of another build request. Now the build queue, here will add it to the queue if it has space which is what it does. What I am not getting is what will kick off another build while dockerbuild is happening
Here is the filewatcher log in case that helps: filewatcherd-1.log
I will try it on my windows VM. It shouldn't matter eclipse vs VSCode right?
Theoretically it should not matter.
I tried on Win with both VSCode and Eclipse and they are working on:
Am I missing any specific configs?
Try creating 4 or 5 open liberty projects in a row (Codewind - not Appsody) using Eclipse. I reinstalled codewind today after a docker system prune -a -f --volumes
and can still reproduce. I also tried reproducing on a Windows VM I have for doing accessibility testing but the scenario seemed to crash docker (it is a very slow machine).
This only happens for OL projects? I tried it with one - same steps as you mentioned with some more: 1 node, 1 lib, 1 spr, 1 OL and 1 appsody microprofile
I found it a lot easier to reproduce with several OL projects.
This could be related to: https://github.com/eclipse/codewind/issues/2072 as I see [INFO][39m Changed Files: [ >.classpath ]
several times in the logs. Also [INFO][39m Changed Files: [ >.project ]
though I thought .project was already ignored.
I think so too. And as we see those changed files, it goes to adding to the queue where the same project is already building in progress. I have talked to Erin, and I am adding a guard against the running build queue where no one project can be in the running build queue at the same time.
From our investigation last week, we found the issue: on eclipse, the .project file triggers build once a maven project is building. this now keeps adding the project to the queue because build queue not works for update. As a result, the running build queue gets clogged with the same project id (first issue). I have a PR open to fix that so that all running build is unique at all times.
However, there is a second issue - .project
file should be ignored along with some other eclipse specific files here https://github.com/eclipse/codewind/issues/2072. I have seen other projects which would trigger a build when .project
changed. We need to fix https://github.com/eclipse/codewind/issues/2072 for the second half of this issue to be fixed.
@eharris369 can you please verify when you have a chance?
The only thing Erin can verify is that running build queues are unique. Other than that, the issue still persists as I described to you yesterday @rajivnathan until #2072 is fixed. I think we can still keep this open but prioritize #2072 and tag that as hot
Putting this back in progress because it sounds like @johnmcollier saw this problem occur with the fix in. @ssh24 to investigate further
Saw this on Che with the generic Docker project "mygo"
Sidecar logs (warning they're big): sidecar_filewatcher.log PFE Logs (warning, they're also big): pfe_logs.txt
@eharris369 @johnmcollier think this can now be verified along with https://github.com/eclipse/codewind/issues/2072
@eharris369 Looks like #2072 went in. I just verified on eclipse and it worked for me (5 fresh OL projects). Could you please verify?
Ah, Rajiv beat me to it haha
I tried various things and I can't get it to fail anymore so closing.
I somehow got into a state where a project is stuck in build queued state even though nothing else is building. I tried adding another project and it got stuck in build queued state as well:
Here is the full pfe log file:
logs.txt