Closed fertinaz closed 5 years ago
If the image is too big, the instance gets overwhelmed and Google automatically kills it. Because of the kill, no signal can be sent back to Singularity Hub, so the build only looks like it's still running. You should assess the final size of your image, and then account for the following:
It looks like you are downloading and compiling a LOT and it's hitting the limits of disk space at some point. There are several things you can try!
I'd do a debug run (locally) where you can print out the sizes of final directories (for each application) and then the final image. That should be a good start.
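A minimal sketch of such a debug step, assuming a POSIX shell with `du` and `sort` available inside the build. The function name `report_sizes` and any paths passed to it are placeholders, not something taken from the recipe in question.

```shell
#!/bin/sh
# Print the disk usage (in KiB) of each directory given as an argument,
# largest first. Paths that do not exist are reported on stderr and skipped.
report_sizes() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            # -s: one summary line per directory, -k: sizes in KiB
            du -sk "$dir"
        else
            printf 'missing\t%s\n' "$dir" >&2
        fi
    done | sort -rn
}
```

Calling `report_sizes` with the install directory of each application at the end of the recipe's build section, and running `du -k` on the finished image file afterwards, should show which component dominates the disk usage.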
Hello
Thanks for the tips. I will try a cleaner version. Is there a way to follow the log files on the Google side?
// Fatih
I can't give anyone direct access to the instance, but if you give me a heads up when it's running, I can shell into the instance and monitor.
I think I have found the root cause.
The parallel compilation probably starts swapping if
cat /proc/cpuinfo | grep "processor" | wc -l
returns the number of threads rather than the number of physical cores.
I didn't check the size of the temp files generated on the fly, but the final image is around 750 MB, so I don't think this is a disk issue.
I have now switched to singularity-3 and a serial compilation; hopefully this will make a difference. I will let you know once my local test finishes.
Thank you.
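If the thread-versus-core hypothesis above holds, the two counts can be compared directly. The following is a sketch under assumptions: it relies on `lscpu` from util-linux being installed in the build environment, and the function names are made up for illustration.

```shell
#!/bin/sh
# Logical CPUs (hardware threads): the same number the /proc/cpuinfo
# pipeline above reports. Falls back to getconf where /proc is unavailable.
logical_cpus() {
    if [ -r /proc/cpuinfo ]; then
        grep -c '^processor' /proc/cpuinfo
    else
        getconf _NPROCESSORS_ONLN
    fi
}

# Physical cores: count unique (core, socket) pairs in lscpu's parseable
# output. If lscpu is missing or prints nothing, fall back to logical CPUs.
physical_cores() {
    n=$(lscpu -p=CORE,SOCKET 2>/dev/null | grep -v '^#' | sort -u | wc -l | tr -d ' ')
    if [ "${n:-0}" -ge 1 ]; then
        echo "$n"
    else
        logical_cpus
    fi
}
```

A recipe could then pass `-j "$(physical_cores)"` to `make` instead of the thread count. Whether that is enough to avoid swapping also depends on how much memory each compile job needs, which this sketch does not measure.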
So, I've just triggered a new build after my latest commit. Not much has changed anyway.
Can you let me know if it hangs?
okay the one you triggered was already dead - I've triggered a new one and I'm going to ssh in, so please hold tight on pushing / changing the collection.
How long does this normally take? I'm sitting here watching it still...
Should take about 3-4 hours, I don't suggest watching it at all...
3-4 hours? Wait, you know that the builder limit is 2 hours right? https://github.com/singularityhub/singularity-python/blob/master/singularity/build/scripts/singularity-build-latest.sh#L30
There might be a kill signal in there regardless (if you did start it when you mentioned, for example, it was already gone at 44 minutes), but generally a 3-4 hour build is not something Singularity Hub currently supports.
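One way to check locally whether a build fits inside a time budget is to wrap it in a wall-clock limit. This is only a sketch, assuming GNU coreutils `timeout` is available; it does not reproduce how the Singularity Hub builder enforces its own limit, and `run_with_limit` is a hypothetical helper name.

```shell
#!/bin/sh
# Run a command under a wall-clock limit and report if the limit was hit.
# GNU timeout exits with status 124 when it had to stop the command.
run_with_limit() {
    limit=$1
    shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "command exceeded the limit of $limit" >&2
    fi
    return "$status"
}
```

For example, `run_with_limit 2h sudo singularity build image.simg Singularity` (image and recipe names are placeholders) would stop a local build after two hours. A local workstation is not the same hardware as the builder, so a pass here is only a rough indication.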
Oh, okay, I didn't know about the 2-hour limit. I was only aware of the builder specifications provided on the wiki page. That explains it. Thank you.
Good point, I will add that right away!
Okay, added to the docs! Actually this is very good news: it means that you can build a base image (I usually use docker, but you could use shub too) that does a big chunk of the compiling, and then build the final image on top of that. Do you want to try that?
That's a good idea. The main package compilation is the most time-consuming part, but compiling the third-party tools also takes a considerable amount of time. I will try that.
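The two-stage idea could look roughly like the sketch below, which writes a final-stage recipe that bootstraps from an already built base image. It assumes the base was pushed to Singularity Hub so that the `Bootstrap: shub` header applies; the helper name, the base collection name, and the `%post` contents are all placeholders.

```shell
#!/bin/sh
# Write a hypothetical final-stage recipe that starts from a prebuilt base.
#   $1: base image reference (placeholder, e.g. a collection holding the
#       compiled third-party tools)
#   $2: path of the recipe file to write
write_final_recipe() {
    base=$1
    out=$2
    cat > "$out" <<EOF
Bootstrap: shub
From: $base

%post
    # Only the main package is compiled in this stage; the third-party
    # tools are assumed to be installed in the base image already.
    echo "compile main package here"
EOF
}
```

Each stage then has its own build, so the third-party tools and the main package would each need to fit within the time limit separately rather than together.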
Hey @fertinaz are you all set here?
No response, closing issue. @fertinaz I apologize that this didn't work, and hopefully if/when we update the builders it will resolve.
Link to Container Collection Log, Build, or Collection (in that order)
Collection: https://www.singularity-hub.org/collections/1859
Behavior when Building Locally
It builds successfully on my CentOS-7 workstation which has 4 cores and 16GB memory.
But this is a relatively complex and time consuming build, because it compiles a large CFD package and its dependencies from scratch.
Error on Singularity Hub
It is still stuck in the "Running" state after 2-3 days.
What do you think is going on?
I've seen similar issues for some large build recipes. Perhaps this is hanging at some point but I don't know where exactly. Can you help me out with that?
Thank you
// Fatih