ythuang0522 / homopolish

High-quality Nanopore-only genome polisher
GNU General Public License v3.0

Using homopolish after pilon directly #38

Closed JamesYang1209 closed 2 years ago

JamesYang1209 commented 3 years ago

Hi professor,

Thanks for the new update. I have found that homopolish crashes when I run it directly on pilon output. I think the bug is caused by the "|" characters that pilon adds to the sequence IDs (e.g. >Seq1|sizeXXX|pilon). After I replaced "|" with "_", everything ran fine.
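The renaming workaround described above can be sketched as a small pre-processing step. This is only an illustration of the idea, not part of homopolish or pilon; the function names `sanitize_fasta_headers` and `sanitize_fasta_file` are hypothetical, and the sketch assumes a plain-text FASTA file in which only header lines start with ">".

```python
def sanitize_fasta_headers(lines, bad="|", replacement="_"):
    """Yield FASTA lines, replacing `bad` with `replacement` in header lines only."""
    for line in lines:
        if line.startswith(">"):
            # Header line, e.g. ">scaffold2|size36705|pilon"
            yield line.replace(bad, replacement)
        else:
            # Sequence lines are passed through unchanged
            yield line


def sanitize_fasta_file(src_path, dst_path):
    """Write a copy of src_path to dst_path with sanitized headers (hypothetical helper)."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(sanitize_fasta_headers(src))


example = [">scaffold2|size36705|pilon\n", "ACGT\n"]
cleaned = list(sanitize_fasta_headers(example))
# cleaned == [">scaffold2_size36705_pilon\n", "ACGT\n"]
```

Running the pilon output through such a step before homopolish avoids the "|" characters without touching the sequences themselves.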

[2021/08/04 11:33] INFO: RUN-ID: scaffold2|size36705|pilon
scaffold2|size36705|pilon
/home/james/small_genome_pipeline/case_analysis/BI-ANA-2919/S118_3/03.SeqPolish/debug
[2021/08/04 11:33] INFO: Stage: Select closely-related genomes
sh: size36705: command not found
sh: size36705: command not found
sh: pilon/scaffold2: No such file or directory
sh: pilon.fasta: command not found
sh: pilon/temp.tab: No such file or directory
sh: size36705: command not found

ERROR: Did not find fasta records in "input files".
sh: size36705: command not found
sh: size36705: command not found
sh: pilon/temp.sort.tab: No such file or directory
sh: pilon/temp.tab: No such file or directory
sh: size36705: command not found
sh: pilon.sort.tab: command not found
sh: size36705: command not found
sh: size36705: command not found
sh: pilon/scaffold2: No such file or directory
sh: pilon/temp.sort.tab: No such file or directory
Traceback (most recent call last):
  File "/home/james/tools/homopolish-0.3.1/homopolish.py", line 58, in <module>
    main()
  File "/home/james/tools/homopolish-0.3.1/homopolish.py", line 42, in main
    FLAGS.output_dir, FLAGS.minimap_args, FLAGS.mash_threshold, FLAGS.download_contig_nums, FLAGS.debug, FLAGS.meta, FLAGS.local_DB_path)
  File "/home/james/tools/homopolish-0.3.1/modules/polish_interface.py", line 329, in polish_genome
    out = without_genus(out, assembly_name, output_dir_debug, mash_screen, assembly, model_path, sketch_path, genus_species, threads, output_dir, minimap_args, mash_threshold, download_contig_nums, debug, meta)
  File "/home/james/tools/homopolish-0.3.1/modules/polish_interface.py", line 258, in without_genus
    download_contig_nums, contig_name, contig.id)
  File "/home/james/tools/homopolish-0.3.1/modules/polish_interface.py", line 45, in mash_select_closely_related
    contig_id)
  File "/home/james/tools/homopolish-0.3.1/modules/mash.py", line 36, in dist
    os.remove('{output_dir}/temp.tab'.format(output_dir=output_dir))
FileNotFoundError: [Errno 2] No such file or directory: 'temp.tab'

ythuang0522 commented 3 years ago

Thanks for reporting this issue. We will fix it in the next major upgrade. Since pilon was designed for short-read polishing, may we ask why you are not using racon for long-read polishing instead?
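For readers wondering why "|" breaks the run: the `sh:` messages in the log suggest that the sequence ID ends up inside a command string that is handed to a shell, where each "|" is parsed as a pipe. The sketch below illustrates that general mechanism and two standard ways around it. It is not homopolish's actual code, and the `ls` command is only a placeholder chosen for illustration.

```python
import shlex
import subprocess
import sys

contig_id = "scaffold2|size36705|pilon"  # ID taken from the log in this issue

# Raw interpolation: a shell would split this at each "|" and try to run
# "size36705" and "pilon.fasta" as separate commands.
unsafe = "ls {}".format(contig_id + ".fasta")

# Option 1: quote the value so the shell sees one literal word.
quoted = "ls {}".format(shlex.quote(contig_id + ".fasta"))

# Option 2: skip the shell entirely by passing an argument list.
# Here a child Python process simply prints back the argument it received.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", contig_id],
    capture_output=True,
    text=True,
    check=True,
)
echoed = result.stdout.strip()
```

With either option the ID reaches the child process as a single argument, so headers such as those written by pilon would not need to be renamed first.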

JamesYang1209 commented 3 years ago

Hi, my workflow is assembly -> racon -> medaka -> pilon -> homopolish, so that both long and short reads are used for polishing. Could you advise whether there is any concern with this workflow? Thank you.

ythuang0522 commented 3 years ago

Oh, you are doing hybrid assembly. I would put homopolish right after medaka, as our models were trained to correct the systematic errors left by medaka, and I'd move pilon polishing with short reads to the last step. This is, to the best of my knowledge, the workflow used by other groups, though it has not been comprehensively evaluated.

JamesYang1209 commented 2 years ago

Thank you for the suggestion. I will follow this workflow.