COMBINE-lab / cuttlefish

Building the compacted de Bruijn graph efficiently from references or reads.
BSD 3-Clause "New" or "Revised" License

Cuttlefish 2.0 runnable? #10

Closed by rickbeeloo 2 years ago

rickbeeloo commented 2 years ago

Hey @jamshed!

I was just reading the new paper about cuttlefish 2.0 and it looks really cool!

Having access to the decompacted graph via --path-cover might be exactly what we need for our analysis, as we spent quite some time decompacting parts of the graph from 1.0. I'm curious what information it holds exactly, so I cloned the develop branch, but I can't seem to run it yet.

cuttlefish build --ref -d test_fastas/ -k 31 -c 1 -o cuttle2_test -t 70 -w tmp/ --path-cover

First I got an error:

output folder does not exist

Since there is no "output folder" argument, and -w seemed fine (specifying -o didn't change it either), I reverted to the directory check commit. That passed the first part, but then raised the following error:

Error: Cannot open temporary file /kmc_00000.bin

Any idea how I can still run it?

Thanks!

jamshed commented 2 years ago

Hi @rickbeeloo, glad to see that you're still utilizing cuttlefish!

I've fixed the empty directory bug with the commit eae3cef896fd426b65ffe85bbdb2cfb214f01c7a. Let us know if it solves the issue!

For the second error with opening temporary KMC files, what working directory (i.e. -w) were you using there? Did you have the correct access permissions for it? I see that your error log mentions /kmc_00000.bin; were you using the root directory (/) for the temporary files?

Also, an aside: you mention that you have to "decompact parts of the graph" from our compacted graph output. Could you please elaborate a bit on the "decompaction" operation that you're using? That could help us better understand your use case, and maybe enable us to help you better with cuttlefish!

Regards!

rickbeeloo commented 2 years ago

Hey @jamshed, will answer the decompaction part in more detail later! Indeed the directory bug is solved :)

Regarding -w, it does not seem to make a difference what path I specify. For instance:

./cuttlefish build --ref -d ~/test_fastas/ -k 31 -c 1 -o cuttle2_test -t 70 path-cover -w ~/tools/cuttlefish2_latest/cuttlefish/bin/

will also give me:

Structural information for the de Bruijn graph is written to cuttle2_test.json.
Error: Cannot open temporary file ./kmc_01021.bin

jamshed commented 2 years ago

Hi @rickbeeloo: could you please check whether you have enough space on the disk for the temporary files?
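A quick way to check both free space and write access for the temporary-file location is sketched below. This is only an illustration: `WORK_DIR` is a hypothetical variable standing in for whatever path is passed to -w (or -o), and the scratch-file name `probe.XXXXXX` is arbitrary.

```shell
# Hypothetical variable: substitute the directory passed to -w (or -o).
WORK_DIR="${WORK_DIR:-.}"

# Show free space on the filesystem that holds the directory.
df -h "$WORK_DIR"

# Confirm the directory is writable by creating and removing a scratch file.
if tmp_file=$(mktemp "$WORK_DIR/probe.XXXXXX"); then
    echo "writable: $WORK_DIR"
    rm -f "$tmp_file"
else
    echo "not writable: $WORK_DIR"
fi
```

If `df` reports little available space, or the scratch file cannot be created, the temporary KMC files would probably fail to open for that reason alone.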

rickbeeloo commented 2 years ago

I see that it's not -w that determines where the KMC files are written, but -o. Regardless, I still get the same error:

Structural information for the de Bruijn graph is written to /net/...../out.json.
Error: Cannot open temporary file /net/......./kmc_01021.bin

If I do:

It's 3am here, will go over it one more time later today

jamshed commented 2 years ago

Hi @rickbeeloo: thanks for checking this information. Any chance that you're executing these on OS X? Also, could you please check whether running the following before executing cuttlefish helps?

ulimit -n 2048
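To see whether the open-file limit is a plausible culprit before changing it, the current value can be inspected first. The failing file index in the error above (01021) is close to 1024, a common default soft limit, which is consistent with the process running out of file descriptors, though it does not prove it. The sketch below assumes 2048 is sufficient; the number of files actually needed for a given run is not stated in this thread.

```shell
# 2048 is the value suggested above; whether it suffices for a given run is an assumption.
WANTED=2048

# Read the current soft limit on open file descriptors for this shell.
current=$(ulimit -n)
echo "current soft limit on open files: $current"

# Raise the soft limit only if it is numeric and below the wanted value.
if [ "$current" != "unlimited" ] && [ "$current" -lt "$WANTED" ]; then
    # Raising the soft limit fails when the hard limit is lower.
    ulimit -n "$WANTED" || echo "could not raise limit; hard limit is $(ulimit -H -n)"
fi
echo "soft limit now: $(ulimit -n)"
```

The new limit applies only to the current shell and the processes started from it, so cuttlefish would need to be launched from the same session.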