mkim0327 closed this issue 3 years ago
Hi there!
It seems that the program can't find the output of RepeatMasker. There could be several reasons, but I would first check whether Trinity ran normally by looking for the file "Trinity.fasta" in the output folder. If the file is present and contains fasta sequences, Trinity is working. The next step would be to check that RepeatMasker is installed properly.
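The Trinity check described above can be sketched in a few lines of Python. This is only an illustration: the helper name count_fasta_records is hypothetical and is not part of dnaPipeTE, and it assumes that "contains fasta sequences" means at least one header line starting with ">".

```python
from pathlib import Path


def count_fasta_records(path):
    """Count FASTA header lines (lines starting with '>') in `path`.

    Returns 0 if the file does not exist or holds no records.
    Hypothetical helper for illustration; not part of dnaPipeTE.
    """
    fasta = Path(path)
    if not fasta.is_file():
        return 0
    with fasta.open() as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

Calling count_fasta_records on the Trinity.fasta file in your output folder should return a number greater than zero if Trinity produced contigs.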
Can you send me the full log (stdout and stderr) as well as the complete command line you use to run dnaPipeTE? I will be able to pinpoint the source of error!
Thanks for using dnaPipeTE!
Cheers,
Clément
Hi!
I got the same error while trying to run the analysis on the test dataset. Trinity seems to be working fine, as I get a Trinity.fasta file. I think the problem arises here:
"Species "All" is not known to RepeatMasker. There may not be any TE families defined in the libraries for this species/clade or there may be an error in the spelling. Please check your entry against the NCBI Taxonomy database and/or try using a broader clade or related species instead. The full list of species/clades defined in the library may be obtained using the famdb.py script."
I attached the full log for reference. log.txt
My command line is: python3 ./dnaPipeTE.py -input ./test/test_dataset.fastq -output ~/Thesis/dnaPipeTE/test -genome_size 2000000 -genome_coverage 0.5 -sample_number 2
Cheers, Christian
Hello,
Yes, this is probably because this mode used to work only if you had the Repbase libraries, and it looks like your version of RepeatMasker (4.1.x) is more recent than the one I use with my dnaPipeTE install (4.0.x).
Right now I would recommend using an external TE library with the option -RM_lib <file.fasta>. If you have an older version of RepeatMasker, you could try to locate the file called "specieslib" buried somewhere in the RepeatMasker/Libraries/ subfolders. It used to be the fasta file generated from Repbase, back when no subscription was required. Please DM me if you need help with that.
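For locating that file, a recursive search is usually enough. The sketch below is only an illustration: find_specieslib is a hypothetical helper, and the directory you pass in (for example your own RepeatMasker/Libraries folder) is an assumption about where your install keeps its libraries.

```python
from pathlib import Path


def find_specieslib(libraries_dir):
    """Return every regular file named 'specieslib' below `libraries_dir`.

    Hypothetical helper for illustration; the location of the Libraries
    folder depends on how RepeatMasker was installed.
    """
    root = Path(libraries_dir)
    return sorted(p for p in root.rglob("specieslib") if p.is_file())
```

Any path it returns could then be tried as the file given to -RM_lib.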
Both your comments make me think that I need to do something about this, and I will try to fix the issue soon.
Best,
Clément
Hi,
It turns out that RepeatMasker (4.1.x) will build the libraries if the -species flag is specified. For the test dataset, I added "-species diptera", and it worked!
Cheers,
Christian
Excellent! Thank you for sharing the tip =)
Cheers,
Clément
Hi Clément,
Unfortunately, I've run into the same error. I'm asking for your help because I'm already using an external TE library. Please find my command below, as well as the STD_OUTPUT here and STD_ERROR here:
python3 /users/yvan/dnaPipeTE.py \
  -input dnaPipeTEst/data/Darwinula_stevensoni_250_350_pass_paired_1_fixed.fastq \
  -output dnaPipeTEst/output_dstevensoni/ \
  -cpu 8 \
  -sample_number 2 \
  -genome_size 455000000 \
  -genome_coverage 0.15 \
  -RM_lib Ds_ONT_EarlGrey/Ds_ONT_EarlGrey_Database/Ds_ONT_EarlGrey-families.fa \
  -RM_t 0.2 \
  -contig_length 200 \
  1>/users/yvan/output.txt 2>/users/yvan/error.txt
Thank you kindly in advance for your help! Looking forward to obtaining results with your wonderful tool.
Hello Yelle,
Thanks for your kind words! First, I would like to ask whether you are using the docker/singularity version of dnaPipeTE. I see in your error files that the version says "container", but from your command line I have the impression that you are not actually using the container for the dependencies. If that's the case, I encourage you to use it as described here: https://tehub.org/tutorials/docs/dnaPipeTE
If you are already doing this, let me know, and I'll dig deeper!
Cheers,
Clément
Hello!
I am trying to run dnaPipeTE with publicly available WGS data.
I will greatly appreciate it if you can help me with the following error message:
parseTagData: ID field not to EMBL spec "SNAP-OL2 repeatmasker; DNA; ???; BP. " from DE RepbaseID: SNAP-OL2XX
at /home/Softwares/dnaPipeTE/bin/RepeatMasker/RepeatMasker line 7611.
Traceback (most recent call last):
  File "./dnaPipeTE.py", line 698, in <module>
    RepeatMasker(config['DEFAULT']['RepeatMasker'], args.RepeatMasker_library, args.RM_species, args.cpu, args.output_folder, args.RM_threshold)
  File "./dnaPipeTE.py", line 381, in __init__
    self.repeatmasker_run()
  File "./dnaPipeTE.py", line 400, in repeatmasker_run
    with open(self.output_folder+"/Trinity.fasta.out", 'r') as trinity_handle:
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/data/Data/dnapipete_3/Trinity.fasta.out'
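Reading the traceback, the FileNotFoundError looks like a downstream symptom: the parseTagData message comes from RepeatMasker itself, so RepeatMasker most likely stopped before writing Trinity.fasta.out. A minimal sketch of a guard that would make this clearer is shown below. The function open_repeatmasker_out is hypothetical and is not how dnaPipeTE is actually written; only the file name Trinity.fasta.out comes from the traceback.

```python
import os


def open_repeatmasker_out(output_folder, name="Trinity.fasta.out"):
    """Open the RepeatMasker .out file, with a clearer error if it is missing.

    Hypothetical helper for illustration; not taken from dnaPipeTE.
    """
    path = os.path.join(output_folder, name)
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"{path} not found: RepeatMasker probably failed before writing it; "
            "check the RepeatMasker messages printed before this traceback"
        )
    return open(path, "r")
```

With a guard like this, the error would point at the RepeatMasker step rather than at the missing file.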