MicrobeLab / DeepMicrobes

DeepMicrobes: taxonomic classification for metagenomics with deep learning
https://doi.org/10.1093/nargab/lqaa009
Apache License 2.0

tfrec_train_kmer.sh tries to create directories in such a way that script fails. #7

Closed lambertsbennett closed 2 years ago

lambertsbennett commented 3 years ago

When I use the helper script to convert the FASTA file I want to train on, I run the following command:

./tfrec_train_kmer.sh -i ~/Documents/Projects/NLP-binning-approaches/data/Combined_transcripts.fasta \
  -v ~/Documents/Projects/NLP-binning-approaches/data/mmetsp.vocab -o mmetsp.train.tfrec -k 12

Whether or not I split the sequences, I get the following output:

  1. Shuffling sequences for training...
mkdir: cannot create directory ‘tmptfrec/home/ben/Documents/Projects/NLP-binning-approaches/data/Combined_transcripts.fasta’: No such file or directory
./tfrec_train_kmer.sh: line 111: tmptfrec/home/ben/Documents/Projects/NLP-binning-approaches/data/Combined_transcripts.fasta/shuffled/home/ben/Documents/Projects/NLP-binning-approaches/data/Combined_transcripts.fasta: No such file or directory
./tfrec_train_kmer.sh: line 116: cd: tmptfrec/home/ben/Documents/Projects/NLP-binning-approaches/data/Combined_transcripts.fasta: No such file or directory
split: invalid number of lines: ‘0’: Numerical result out of range
rm: cannot remove 'shuffled/home/ben/Documents/Projects/NLP-binning-approaches/data/Combined_transcripts.fasta': No such file or directory

  1. Converting to TFRecord...
ls: cannot access 'subset': No such file or directory
Can't use 'defined(@array)' (Maybe you should just omit the defined()?) at /home/ben/Documents/Projects/DeepMicrobes/bin/parallel line 119.
cat: 'subset.tfrec': No such file or directory
rm: cannot remove 'subset*.tfrec': No such file or directory
rmdir: failed to remove 'tmptfrec/home/ben/Documents/Projects/NLP-binning-approaches/data/Combined_transcripts.fasta': No such file or directory
Finished.

It looks like the script prepends the temporary directory names to the given path instead of appending them, which causes every later step to fail. Is there something I am missing?

MicrobeLab commented 3 years ago

Yes, you're right. The script assumes that the input fasta (-i) is not a full path. That's why mkdir failed at the beginning. You could modify that line of code to allow the script to add the temporary directories to the end of the given paths. Thanks.
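
For anyone hitting the same error, here is a rough sketch of the kind of change described above. It is not the actual code from tfrec_train_kmer.sh: the function name tmp_dir_for, the variable names, and the tmp_tfrec_ prefix are all made up for illustration. The idea is simply to build the temporary directory name from the file name alone (via basename), so that a full path passed to -i no longer ends up embedded in the directory name.

```shell
#!/usr/bin/env bash
# Sketch only: the names below (tmp_dir_for, input_fasta, the tmp_tfrec_
# prefix) are hypothetical and are not taken from tfrec_train_kmer.sh.

# Derive a temporary directory name from the file name alone.
# basename strips any leading directories, so both a bare file name and a
# full path produce the same result.
tmp_dir_for() {
    local input_fasta="$1"
    local name
    name="$(basename -- "$input_fasta")"
    printf 'tmp_tfrec_%s\n' "$name"
}

# Example: use the first argument if given, otherwise a placeholder path.
input_fasta="${1:-/home/user/data/reads.fasta}"
tmp_dir="$(tmp_dir_for "$input_fasta")"
echo "temporary directory would be: $tmp_dir"
```

In the real script, the resulting name would then be used wherever the temporary directory is created, entered, and removed. Going by the maintainer's description, running the unmodified script from the directory that contains the FASTA file and passing only the file name to -i should also avoid the failing mkdir, though I have not verified that against the script itself.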