chrisquince / DESMAN

De novo Extraction of Strains from MetAgeNomes

Error running automated workflow on test data #28

Open pug-ventures opened 6 years ago

pug-ventures commented 6 years ago

Hi Chris!

I've been having trouble running the automated workflow on the test data.

The problem seems to be that files aren't found in the findEliteGenes step. I've attached the nextflow log file.

nextflow_user.log

I feel like there's a pretty simple fix to this, but my amateurish messing with desmanflow.nf didn't bring any good results so far...

Thanks for your help!

njdbickhart commented 6 years ago

I'm currently running the same workflow and ran into several more problems related to the "findEliteGenes process" as well. It turns out that the Phylosift database files are no longer available online at the default weblinks (see https://github.com/gjospin/PhyloSift/issues/492 for more details). To add insult to injury, the Phylosift developers changed the naming of the marker files, so it appears that the "DNGNGWU*.codon.updated.1" marker files are no longer present in the current database.

Here's my workaround:

  1. Make sure that Phylosift is extracted from the tarball in the /DESMAN/external/phylosift*/ directory
  2. Create a new desmanflow.nf script (ie. $ cp desmanflow.nf .nf)
  3. In the new desmanflow script, delete line 66 (the tar -xvf command to open up the phylosift tarball). If you fail to do this, your changes to the phylosift config file will be overwritten each time!
  4. Download the phylosift dependency folders from the link in issue #492 and install them by doing the following:
     a. Unpack the markers_2014 and ncbi tarballs into the same directory.
     b. Make sure that you do NOT unpack the "plain" markers.tgz file, since it does not contain the SCGs that DESMAN searches for.
  5. In the external/phylosift folder, edit the "phylosiftrc" file and make the following changes:
     a. Remove the comments (the '#' prefixes) from the $marker_dir and $ncbi_dir variables, and set both to the path of the BASE directory of your recently downloaded phylosift markers and ncbi folders (i.e. if you unpacked the markers folder in /path/to/my/database/, set the value of $marker_dir and $ncbi_dir to "/path/to/my/database/").
     b. Remove the comment from the $disable_update_check=1 line. Otherwise, phylosift will crash each time, since it cannot access the original websites that hosted the marker files.
  6. Run $DESMANHOME/external/phylosift*/phylosift index --debug to index your markers in the database directory.
  7. Run this command: echo "Arrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrgggg!"

That should properly index your marker files for phylosift, fix the bugs in the workflow and allow you to progress to the next processes in the desman nextflow script.
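
If you'd rather script step 5 than edit the file by hand, here is a minimal sketch. It assumes the settings sit in phylosiftrc as commented-out assignments such as `#$marker_dir = "";` and `#$disable_update_check=1;`, which may not match your copy exactly, so check the file first. The function name `enable_local_db` is made up for illustration and is not part of DESMAN or Phylosift.

```shell
#!/bin/sh
# Sketch only: enable the local-database settings in a phylosiftrc-style file.
# ASSUMPTION: the settings appear as commented-out assignments, e.g.
#   #$marker_dir = "";
#   #$ncbi_dir = "";
#   #$disable_update_check=1;
# enable_local_db is a hypothetical helper name.

enable_local_db() {
    rc_file=$1   # path to the phylosiftrc file
    db_dir=$2    # base directory holding the unpacked markers and ncbi folders

    # Keep a backup so the original config can be restored.
    cp "$rc_file" "$rc_file.bak" || return 1

    # Uncomment the two directory variables and point them at db_dir,
    # then uncomment the update-check switch without changing its value.
    sed -e "s|^#*[[:space:]]*\\\$marker_dir[[:space:]]*=.*|\$marker_dir = \"$db_dir\";|" \
        -e "s|^#*[[:space:]]*\\\$ncbi_dir[[:space:]]*=.*|\$ncbi_dir = \"$db_dir\";|" \
        -e "s|^#*[[:space:]]*\\\$disable_update_check|\$disable_update_check|" \
        "$rc_file.bak" > "$rc_file"
}
```

Call it with the path to your phylosiftrc and the database directory, e.g. `enable_local_db path/to/phylosiftrc /path/to/my/database/`. Remember step 3: if the tar command is still in your workflow script, it will overwrite these changes on the next run.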

pug-ventures commented 5 years ago

Run this command: echo "Arrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrgggg!"

That genuinely made me laugh. Thanks! :D