Open splitice opened 4 years ago
Thank you for reporting this issue. Would you be able to share some additional details to help us reproduce and debug it? For example: the CLI's debug output (`--debug` flag or `DEBUG=1` env var), the output of `balena version -a`, and whether it is a npm installation.

Also, have you considered the `--detached` option?

```
$ balena push --help
...
OPTIONS
  -d, --detached
      When pushing to the cloud, this option will cause the build to start, then
      return execution back to the shell, with the status and release ID (if
      applicable). When pushing to a local mode device, this option will cause
      the command to not tail application logs when the build has completed.
```
Thanks @pdcastro. I'm trying to work out, in the background, what attribute of our build leads to this issue.
Unfortunately, with ~20-minute builds, an issue that occurs on roughly 1 in 7 runs, and a busy team that needs a working CI, it's a project on the back burner.
The environment is Linux (GitHub Actions) and the balena-cli version is the latest stable at build time (https://github.com/HalleyAssist/push-to-balenacloud).
I'm currently testing a hypothesis that it occurs once a certain number of bytes have been transmitted, as it doesn't seem to occur during silent periods. I've reduced the output from our build scripts to verify this and will be running builds today and tomorrow.
@pdcastro I've managed to get 6 consecutive passing builds by piping the output of a large tar extract command to /dev/null, thereby reducing the amount of output. The output is still quite long (we get `Earlier logs truncated...` in the txt files on your end), but it's half of what it was.
I'll need to do a few more builds to ensure that I'm not just rolling the dice correctly. There wouldn't, by any chance, be a max response body size limit on your end (or something similar)?
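One way to gather evidence for the bytes-transmitted hypothesis would be to record how much output each push streams and compare passing versus failing runs. A minimal sketch of such a CI wrapper, where `seq 1 100000` is just a stand-in for the real `balena push` invocation and the log filename is arbitrary:

```shell
#!/bin/bash
# Hypothetical CI wrapper sketch: tee the push output to a log, count the
# bytes streamed, and keep the exit status of the push command itself.
set -o pipefail
LOG=build-output.log

seq 1 100000 |& tee "$LOG" > /dev/null   # stand-in for: balena push <fleet>
status=$?                                # push's status, thanks to pipefail

echo "push exit status: $status"
echo "bytes streamed: $(wc -c < "$LOG")"
```

Plotting the byte counts of failing versus passing runs would show whether the disconnects cluster above some output volume.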
Headless mode isn't really suitable for us, as capturing the output of `balena push` / `git push` is the only way for us to get the build output. It's otherwise truncated on your end if retrieved by txt file.
Thank you for sharing these results @splitice. I am not aware of a max response body size limit, other than the truncation you mentioned - which shouldn't cause the CLI to end mid build. If the amount of logs was really large, I wonder if the CLI process might be reaching some Node.js limit - but again I am not aware of what that limit would be, and capturing the CLI output in debug mode (`DEBUG=1` env var or `--debug` flag) might give us a clue.
Also, a couple of suggestions:
> Headless mode isn't really suitable for us as capturing the output of `balena push` / `git push` is the only way for us to get the build output.
What about redirecting the output of your build commands / script to a text file saved on the image itself? A before/after example:

Dockerfile before:

```dockerfile
...
RUN build-script.sh
```

Dockerfile after:

```dockerfile
...
SHELL ["/bin/bash", "-c"]
RUN build-script.sh &> /tmp/image-build-output.txt
```

(I've selected `bash` as the shell so I could use `&>` redirection.)
Or also using the `tee` command to have both live output and saving to a file:

```dockerfile
...
SHELL ["/bin/bash", "-c"]
RUN build-script.sh |& tee /tmp/image-build-output.txt
```
Then you might be able to use headless mode without losing the full logs.
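The `|&` pipe is bash shorthand for `2>&1 |`, so both stdout and stderr reach `tee`. A quick local sanity check of the pattern (outside Docker), with `fake-build` as a hypothetical stand-in for `build-script.sh`:

```shell
#!/bin/bash
# fake-build stands in for build-script.sh: one line on stdout, one on stderr.
fake-build() {
  echo "compiling"
  echo "warning: deprecated flag" >&2
}

# |& pipes stdout AND stderr into tee, which prints live and saves a copy.
fake-build |& tee /tmp/image-build-output.txt

wc -l < /tmp/image-build-output.txt   # expect 2: both streams captured
```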
> The environment is Linux (GitHub Actions) and the balena-cli version is the latest stable at build time
Interesting! Hopefully GitHub isn't introducing additional issues - like killing the CLI process / container because of their own resource usage limits. By the way, using the latest CLI build is probably a good thing generally, but perhaps not as good while trying to isolate an issue, as the CLI version may have changed compared to "previous observations" of the issue.
As we used GitHub both before and after, with `git push` and balena-cli, I wouldn't expect them to be at fault here. I should, however, test that when I have time.
As a test I would suggest using the extraction of a large (many file) tar archive. As that's what I directed to /dev/null to largely solve the issue.
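For anyone trying to reproduce that step, a sketch that builds a many-file archive and extracts it verbosely, so every filename becomes a line of output; the file count and paths here are arbitrary:

```shell
#!/bin/bash
# Reproduce a noisy build step: tar up many small files, then extract with
# -v so every entry is printed, approximating the log volume in question.
set -e
workdir=$(mktemp -d)
mkdir -p "$workdir/src"
for i in $(seq 1 2000); do : > "$workdir/src/file$i"; done
tar -C "$workdir" -cf "$workdir/many.tar" src

# Verbose extraction emits one line per entry (tar -xv prints to stderr);
# redirecting this listing to /dev/null is what made the builds pass.
tar -C "$workdir" -xvf "$workdir/many.tar" > extract.log 2>&1
wc -l < extract.log
```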
I've been using `git push` in my CI process for years now. Recently I opted to upgrade to `balena push` as suggested. Everything was going fine until we noticed more failures than normal.
It's not that the build fails, but the `balena push` command ends mid build and returns an error code.
The build itself continues on the remote cloud build server and completes successfully, however.