NCBI-Hackathons / ViruSpy

A pipeline for viral identification from metagenomic samples
MIT License

BUD Algorithm Bug #2

Closed glickmac closed 7 years ago

glickmac commented 7 years ago

There is a memory error in the implementation of the BUD algorithm. The check for whether a FASTA file has been extended does not complete, because too many files are created when each contig is split out to be checked against the contigs_extend Perl script. To avoid creating such a large number of input and output files, the FASTA data needs to either be moved into a dictionary or be read contig by contig, line by line.
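A minimal sketch of the "read each contig by line" idea, assuming a plain multi-FASTA input. The names `for_each_contig` and `print_length` are hypothetical and not part of ViruSpy. The function streams the file and hands each record to a callback, so no per-contig files are written:

```shell
# Hypothetical helper (not from budding.sh): stream a multi-FASTA file and
# call a callback once per record with the header and the joined sequence.
for_each_contig() (
    fasta=$1
    callback=$2
    header=""
    seq=""
    # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            ">"*)
                # A new header starts: emit the previous record, if any.
                if [ -n "$header" ]; then "$callback" "$header" "$seq"; fi
                header=${line#\>}
                seq=""
                ;;
            *)
                # Sequence lines are concatenated in memory.
                seq="$seq$line"
                ;;
        esac
    done < "$fasta"
    # Emit the last record in the file.
    if [ -n "$header" ]; then "$callback" "$header" "$seq"; fi
)

# Example callback: print the contig header and its sequence length.
print_length() {
    printf '%s\t%d\n' "$1" "${#2}"
}

# Usage (assumed file name): for_each_contig contigs.fasta print_length
```

Only one record is held in memory at a time. A callback that needs to run the extension check could pipe the record to the Perl script on standard input, if that script can read from it, which is not confirmed here.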

pcantalupo commented 7 years ago

Please post the exact error message. Do you have a small test set of input sequences that can reproduce the error?

glickmac commented 7 years ago

./budding.sh: line 66: /usr/local/bin/rename: Argument list too long

The error seems to stem from splitting each contig into its own file. I will post the test set I am using to the Slack, along with the necessary files (the contig file plus the Perl scripts). Caution: this may take an hour to assemble, even with the ends as the input.

pcantalupo commented 7 years ago

You could fix this with xargs https://stackoverflow.com/questions/11289551/argument-list-too-long-error-for-rm-cp-mv-commands#11289567

But why do you need the rename? Probably because of line 69 (the for loop), which doesn't work with file names that contain spaces. You can fix this by setting the IFS variable appropriately: https://www.cyberciti.biz/tips/handling-filenames-with-spaces-in-bash.html
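A rough sketch of both suggestions, run against a throwaway directory (`demo_dir` and the file names are made up for illustration, and the loop is assumed to iterate over `ls` output, which may not match line 69 of `budding.sh`). "Argument list too long" is raised when a single command receives more arguments than the kernel allows (`getconf ARG_MAX`). `find ... -print0 | xargs -0` avoids that by batching, and it needs a `find`/`xargs` pair that supports NUL-delimited names, as GNU and BSD do:

```shell
# Hypothetical demo directory standing in for the per-contig files.
demo_dir=$(mktemp -d)
touch "$demo_dir/contig 1.fasta" "$demo_dir/contig 2.fasta" "$demo_dir/contig 3.fasta"

# xargs approach: names are streamed NUL-delimited and the command is run in
# batches, so no single invocation exceeds the argument limit.
xargs_count=$(find "$demo_dir" -type f -name '*.fasta' -print0 |
    xargs -0 -n 1 basename | wc -l | tr -d ' ')

# IFS approach: split command output on newlines only, so a name such as
# "contig 1.fasta" stays one word inside the loop.
old_ifs=$IFS
IFS='
'
loop_count=0
for f in $(ls "$demo_dir"); do
    case $f in
        *.fasta) loop_count=$((loop_count + 1)) ;;
    esac
done
IFS=$old_ifs

# Alternative: a quoted glob needs no IFS change and never hits the limit,
# because the shell expands it without executing an external command.
glob_count=0
for f in "$demo_dir"/*.fasta; do
    [ -e "$f" ] || continue
    glob_count=$((glob_count + 1))
done

echo "$xargs_count $loop_count $glob_count"
```

All three counts should agree. The glob form is usually the simplest choice when the loop body can take one file at a time.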

glickmac commented 7 years ago

I think you identified the problem! I am going to try to put the rename command into a for loop. If that doesn't work then I will try removing the rename and operating with the IFS variable.
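One way the rename could be moved into a loop is sketched below. The actual `rename` expression on line 66 is not shown in this thread, so this assumes it replaced spaces with underscores. `work_dir` and the file names are hypothetical. Renaming one file per iteration means no command ever receives the full file list:

```shell
# Hypothetical working directory standing in for the split contig files.
work_dir=$(mktemp -d)
touch "$work_dir/contig 1.fasta" "$work_dir/contig 2.fasta" "$work_dir/plain.fasta"

# The glob is expanded by the shell itself, so the number of files is not
# limited by ARG_MAX. Each mv call gets exactly two path arguments.
for f in "$work_dir"/*; do
    [ -e "$f" ] || continue
    dir=$(dirname "$f")
    base=$(basename "$f")
    # Assumed transformation: replace spaces with underscores.
    new=$(printf '%s' "$base" | tr ' ' '_')
    if [ "$base" != "$new" ]; then
        mv -- "$f" "$dir/$new"
    fi
done
```

This starts one `mv` process per renamed file, which is slower than a single batched call but stays well within the argument limit.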

DCGenomics commented 7 years ago

You people are my heroes

glickmac commented 7 years ago

The loop removes the bug! Woot!