kavonrtep / dante_ltr

GNU General Public License v3.0

"/" in reference sequence names #3

Closed crimBubble closed 1 year ago

crimBubble commented 1 year ago

Hi, I noticed that dante_ltr seems to really dislike `/` in reference sequence names. I am not sure exactly why, but replacing the `/` in the names with underscores (`_`) prevented the following error from occurring:

dante_ltr -g run2_v7-26-yahs-scaf9/dante_out/old/output_domains.gff -s run2_v7-26-yahs-scaf9/Inpactor2_library.fasta -o run2_v7-26-yahs-scaf9/dante_out/old/danteLTR -c 12

...

[main] Real time: 0.005 sec; CPU: 0.007 sec; Peak RSS: 0.010 GB
done.
Number of putative TE with identified LTR   : 596 
Warning message:
In mclapply(seq_along(gr), function(x) get_TE(s_left[x], s_right[x],  :
  596 function calls resulted in an error
Identification of PBS ...done
Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'table' in selecting a method for function '%in%': $ operator is invalid for atomic vectors
Calls: sapply ... sapply -> lapply -> FUN -> %in% -> .handleSimpleError -> h
In addition: Warning messages:
1: In mclapply(seq_len(n), do_one, mc.preschedule = mc.preschedule,  :
  all scheduled cores encountered errors in user code
2: In mclapply(gff3_list, FUN = add_pbs, s = s, trna_db = trna_db,  :
  596 function calls resulted in an error
Execution halted
Traceback (most recent call last):
  File "/scratch_io/software/anaconda3/envs/repeatexplorer/bin/dante_ltr", line 638, in <module>
    main()
  File "/scratch_io/software/anaconda3/envs/repeatexplorer/bin/dante_ltr", line 626, in main
    subprocess.check_call(
  File "/scratch_io/software/anaconda3/envs/repeatexplorer/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/scratch_io/software/anaconda3/envs/repeatexplorer/share/dante_ltr/detect_putative_ltr.R', '-s', 'run2_v7-26-yahs-scaf9/Inpactor2_library.fasta', '-g', 'run2_v7-26-yahs-scaf9/dante_out/old/output_domains.gff', '-o', 'run2_v7-26-yahs-scaf9/dante_out/old/danteLTR', '-c', '12', '-M', '0', '-L', '0.6']' returned non-zero exit status 1.

I guess that might be related to the classification/naming convention used in RexDB. For anybody else running into this issue: I replaced every `/` with `_` (and every `#` with `__`, just to be sure) in my reference sequence names using:

sed -i -e 's/\//_/g' -e 's/#/__/g' reference.fasta
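For those who prefer not to edit the file in place, here is a minimal Python sketch of the same renaming. It only touches header lines (those starting with `>`) and leaves sequence lines alone. The function names `sanitize_header` and `sanitize_fasta` are made up for this example and are not part of dante_ltr. If the DANTE GFF was generated from the original names, its sequence IDs would presumably need the same change so that both files still match.

```python
def sanitize_header(line: str) -> str:
    """Replace '/' with '_' and '#' with '__' (same mapping as the sed command)."""
    return line.replace("/", "_").replace("#", "__")


def sanitize_fasta(lines):
    """Yield FASTA lines, rewriting only header lines (those starting with '>')."""
    for line in lines:
        yield sanitize_header(line) if line.startswith(">") else line


# Small in-memory demo with a hypothetical record name.
records = [">TE_1#LTR/Copia\n", "ACGTACGT\n"]
cleaned = list(sanitize_fasta(records))
print("".join(cleaned), end="")
```

To apply it to a real file, the same generator can be fed an open input file handle and its output written to a new file, which keeps the original FASTA untouched.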

So far I really like working with dante_ltr, great tool :)