malonge / RagTag

Tools for fast and flexible genome assembly scaffolding and improvement
MIT License

updategff issues #166

Open El-Castor opened 1 year ago

El-Castor commented 1 year ago

Hi!

I have corrected and scaffolded my assembly against a closely related reference genome that has already been sequenced. Now I am trying to update my GFF so that its coordinates move from the uncorrected assembly to the scaffolded one.

I scaffolded my assembly using this command:

ragtag.py scaffold $geneome_faste_ref_DIR $outDir/ragtag.correct.fasta -o $outDir_scaffolding -t 16 -C

with

geneome_faste_ref_DIR: path to my reference genome

outDir/ragtag.correct.fasta: path to my assembly, which was corrected first

outDir_scaffolding: output directory

Then I tried to update my GFF with this command:

ragtag.py updategff $gff_PATH $current_agpScafolded_PATH

with:

gff_PATH: the GFF file produced with AUGUSTUS for my uncorrected, unscaffolded assembly (the query)

current_agpScafolded_PATH: the AGP output file from the ragtag.py scaffold command

When I run this command, I get an error:

 (ragtag) cpichot@node13:/NetScratch/FLOCAD/cpichot/genome_assembling/P.sativum_DGL_genomeAssembly$ ragtag.py updategff $gff_PATH_TEST $current_agpScafolded_PATH_TEST
Mon Jul 31 16:09:21 2023 --- VERSION: RagTag v2.1.0
Mon Jul 31 16:09:21 2023 --- CMD: ragtag.py updategff /NetScratch/FLOCAD/cpichot/genome_assembling/P.sativum_DGL_genomeAssembly/dgl_contigOfInterestOnly.20230210.gff3 /NetScratch/FLOCAD/cpichot/genome_assembling/1-improveAssembly/out-ragtag-Allgenome/ragtag.correct.agp
##gff-version 3
##sequence-region ptg000001l 1 642050

ptg000001l  BioFileConverter    gene    13645   14031   .   +   .ID=gene:ptg000001l.1;Name=ptg000001l.1;locus_tag=Ps_ptg000001lg000010
Traceback (most recent call last):
  File "/NetScratch/cpichot/.conda/envs/ragtag/bin/ragtag_update_gff.py", line 162, in <module>
    main()
  File "/NetScratch/cpichot/.conda/envs/ragtag/bin/ragtag_update_gff.py", line 156, in main
    sup_update(gff_file, agp_file)
  File "/NetScratch/cpichot/.conda/envs/ragtag/bin/ragtag_update_gff.py", line 114, in sup_update
    raise ValueError("Inconsistent input files.")

Here is the head of the AGP file:

## agp-version 2.1
# AGP created by RagTag v2.1.0
chr1LG6_RagTag  1   134342  1   W   ptg002004l_431067_565408_+  1   134342  +
chr1LG6_RagTag  134343  134442  2   U   100 scaffold    yes align_genus
chr1LG6_RagTag  134443  177232  3   W   ptg002087l_457179_499968_+  1   42790   +
chr1LG6_RagTag  177233  177332  4   U   100 scaffold    yes align_genus
chr1LG6_RagTag  177333  232767  5   W   ptg001609l_360920_416354_+  1   55435   +
chr1LG6_RagTag  232768  232867  6   U   100 scaffold    yes align_genus
chr1LG6_RagTag  232868  300311  7   W   ptg002004l_1_67444_+    1   67444   +
chr1LG6_RagTag  300312  300411  8   U   100 scaffold    yes align_genus
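One way to narrow this down is to compare the sequence names in the two files. The sketch below is only a diagnostic idea, not part of RagTag: it assumes the "Inconsistent input files." error is related to GFF sequence IDs (column 1) that do not appear as component IDs (column 6) in the AGP. The helper names `gff_seqids`, `agp_component_ids`, and `missing_from_agp` are hypothetical, and the sample data is copied from the excerpts above.

```python
import io


def gff_seqids(handle):
    """Collect the sequence IDs found in column 1 of GFF feature lines."""
    ids = set()
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        ids.add(line.split(None, 1)[0])
    return ids


def agp_component_ids(handle):
    """Collect component IDs (column 6) from AGP lines that are not gaps."""
    ids = set()
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Component type is column 5; "N" and "U" mark gap lines.
        if len(fields) >= 6 and fields[4] not in ("N", "U"):
            ids.add(fields[5])
    return ids


def missing_from_agp(gff_handle, agp_handle):
    """Return GFF sequence IDs that never occur as an AGP component."""
    return sorted(gff_seqids(gff_handle) - agp_component_ids(agp_handle))


# Small demo using lines taken from the GFF and AGP excerpts in this post.
gff_text = (
    "##gff-version 3\n"
    "##sequence-region ptg000001l 1 642050\n"
    "ptg000001l\tBioFileConverter\tgene\t13645\t14031\t.\t+\t.\tID=gene:ptg000001l.1\n"
)
agp_text = (
    "## agp-version 2.1\n"
    "chr1LG6_RagTag\t1\t134342\t1\tW\tptg002004l_431067_565408_+\t1\t134342\t+\n"
    "chr1LG6_RagTag\t134343\t134442\t2\tU\t100\tscaffold\tyes\talign_genus\n"
)

print(missing_from_agp(io.StringIO(gff_text), io.StringIO(agp_text)))
# prints ['ptg000001l']
```

With real files, the same functions can be called on `open(gff_path)` and `open(agp_path)`. In the excerpts above, the GFF uses the original contig name `ptg000001l`, while the AGP components carry coordinate suffixes such as `ptg002004l_431067_565408_+`, so a name mismatch of this kind would show up in the returned list.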

Do you see my mistake? What should I do to fix this issue?

Thanks in advance!