Open pcm32 opened 2 years ago
I can also add that I ran this with 32 GB of RAM. Does it need more?
No, it won't need more RAM. This error is the scourge on my back across all ontologies. It happens due to the flakiness of the FTP servers that serve ChEBI at the EBI data center. There is nothing you can do other than try again later.
But looking at the log file, there is an urgent need to review the customised pipeline: the circularity warnings need to be fixed ASAP.
Thanks @matentzn. If I were to override this line:
with some local path to the ChEBI ontology, would that work to avoid the FTP step in the middle? Thanks!
I will try to find a more general solution for you, but it can't be today. Can it wait till Thursday?
It will take me longer now to describe how to do the workaround. This cycle also needs fixing!
Sure, Thu should be fine, thanks!
Thanks Nico. We should plan a review and update of the pipeline with @anitacaron & @gouttegd at some point soon.
I believe the circular dependency is similar to a problem previously highlighted in the ODK.
The `seed.txt` target depends on `$(SRCMERGED)`, which depends on both `$(SRC)` (the -edit file, which in SCAO is mostly empty as it only contains imports) and on `$(OTHERSRC)` (which contains the components `fbbt.owl` and `cl.owl`). So ultimately, the `seed.txt` file can only be created once the `fbbt.owl` and `cl.owl` components have been generated.
But the `fbbt.owl` component depends on `fbbt_simple_seed.txt`, which itself (as all `%_simple_seed.txt` files) depends on `seed.txt`. So now we need `seed.txt` to build `fbbt.owl`, which we need to build `seed.txt`. BOOM.
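To make the loop easier to see, here is a minimal Python sketch that models the dependencies described above as a graph and walks it until it comes back to where it started. The node names are simplified labels rather than the exact Makefile targets, and `find_cycle` is a hypothetical helper written for this illustration. Only the `fbbt.owl` edge is spelled out above, so `cl.owl` is left without prerequisites here, which is an assumption.

```python
# Dependency edges as described in the comment: target -> prerequisites.
# Labels are simplified; they are not the literal Makefile target names.
DEPS = {
    "seed.txt": ["SRCMERGED"],
    "SRCMERGED": ["SRC", "OTHERSRC"],
    "OTHERSRC": ["fbbt.owl", "cl.owl"],
    "fbbt.owl": ["fbbt_simple_seed.txt"],
    "fbbt_simple_seed.txt": ["seed.txt"],
    "SRC": [],
    "cl.owl": [],  # assumption: its own seed dependency is not modelled
}


def find_cycle(deps, start):
    """Return one dependency cycle reachable from `start`, or None."""
    path = []        # nodes on the current depth-first path, in order
    on_path = set()  # same nodes, for fast membership checks
    done = set()     # nodes fully explored without finding a cycle

    def visit(node):
        if node in on_path:
            # We came back to a node we are still building: that is a cycle.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        on_path.add(node)
        for dep in deps.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        done.add(node)
        return None

    return visit(start)


cycle = find_cycle(DEPS, "seed.txt")
print(" -> ".join(cycle))
```

The printed chain starts and ends at `seed.txt`, which is the same loop that the circularity warnings in the log point at.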
A quick and dirty workaround would be to patch the standard `Makefile` to make `seed.txt` depend only on `$(SRC)` and not on `$(OTHERSRC)`.
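As a rough check of that workaround, the sketch below models the dependencies before and after the patch and asks Python's standard `graphlib.TopologicalSorter` (Python 3.9+) for a build order. The node labels are simplified stand-ins for the real Makefile targets, `build_order` is a hypothetical helper, and the graph covers only the edges mentioned in this thread, so treat it as an illustration and not a test of the actual ODK Makefile.

```python
from graphlib import CycleError, TopologicalSorter


def build_order(deps):
    """Return a valid build order (prerequisites first), or None on a cycle."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


# Current situation: seed.txt depends on the merged source, which pulls in
# the components, one of which needs seed.txt again.
current = {
    "seed.txt": ["SRCMERGED"],
    "SRCMERGED": ["SRC", "OTHERSRC"],
    "OTHERSRC": ["fbbt.owl", "cl.owl"],
    "fbbt.owl": ["fbbt_simple_seed.txt"],
    "fbbt_simple_seed.txt": ["seed.txt"],
}

# Workaround: seed.txt depends on SRC only (assumed shape of the patch).
patched = {**current, "seed.txt": ["SRC"]}

print("current:", build_order(current))
print("patched:", build_order(patched))
```

With the patched graph a build order exists in which `seed.txt` comes before `fbbt_simple_seed.txt` and `fbbt.owl`; with the current graph no order exists at all.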
But I think a deeper review of the pipeline to streamline the dependencies would be a much better solution.
@pcm32
Try the pipeline again now. With a bit of luck it will work even without you using any parameters. In case chebi and/or ncbitaxon give you grief, let me know and I will show you how to skip over them using a specific parameter.
I have set up the scatlas generation pipeline to run from a Singularity container (see PR #30, based on the same container used from Docker and stemming from your current master; please let me know if I should be basing it on a different branch) so that it can run on the cluster from our internal CI. However, on running:
I'm getting some errors (look at the bottom mostly):
Could you advise on what the problem could be with regard to tmp/seed.txt, or the ZLIB error? Or is this expected? The process currently exits with an error code.
Pinging @gouttegd @matentzn as advised by @dosumis . Thanks!