conchoecia / odp

oxford dot plots
GNU General Public License v3.0

problem converting gff #37

Open pedres opened 1 year ago

pedres commented 1 year ago

Hi, I am trying to convert a gff file to a .chrom file. The script works with the example:

python NCBIgff2chrom.py GCF_000001405.39_GRCh38.p13_genomic.gff.gz > GCF_000001405.39_GRCh38.p13_genomic.chrom

However, it does not work with the two genomes I want to compare (two earthworm species, accessions GWHACBE00000000 and GWHAOSM00000000.1 from https://ngdc.cncb.ac.cn/). Below I paste the first two entries of each gff file, which I think do not differ from those in the human gff file. What could be the problem? Thank you very much for your help.

OriSeqID=Chr01 Accession=GWHACBE00000001

GWHACBE00000001 EVM gene 1225623 1239308 . - . ID=evm.TU.Chr01.49;Accession=GWHGACBE000049;Name=evm.TU.Chr01.49
GWHACBE00000001 EVM mRNA 1225623 1239308 . - . ID=evm.model.Chr01.49;Accession=GWHTACBE000049;Parent=evm.TU.Chr01.49;Parent_Accession=GWHGACBE000049

OriSeqID=Contig0 Accession=GWHAOSM00000001.1

GWHAOSM00000001.1 EVM gene 669375 680474 . - . ID=evm.TU.Contig0.5;Accession=GWHGAOSM000001.1;Name=EVM%20prediction%20Contig0.5;transl_table=1
GWHAOSM00000001.1 EVM mRNA 669375 680474 . - . ID=evm.model.Contig0.5;Accession=GWHTAOSM000001.1;Parent=evm.TU.Contig0.5;Parent_Accession=GWHGAOSM000001.1;Name=EVM%20prediction%20Contig0.5;transl_table=1

conchoecia commented 1 year ago

Thanks for your comment! You will most likely have to parse the .gff file yourself with bash, awk, sed, cut, et cetera to create the .chrom file if the script that I provided doesn't work.

The .gff format is poorly defined and inconsistent between sources, so unfortunately it didn't work in this case. If you want to modify the script and create a pull request that would be great!
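For anyone who wants to try parsing the file by hand, here is a minimal Python sketch rather than a tested fix. It assumes the .chrom file is tab-delimited with five columns (protein ID, scaffold, strand, start, stop), that the gff columns are tab-separated, and that the mRNA `ID` attribute is the name you want in the first column. The function names `parse_attributes` and `gff_to_chrom` are made up for this example and are not part of odp.

```python
from urllib.parse import unquote


def parse_attributes(field):
    """Turn a GFF3 attribute string 'key=value;key=value' into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            # GFF3 percent-encodes special characters, e.g. %20 for a space
            attrs[key.strip()] = unquote(value.strip())
    return attrs


def gff_to_chrom(lines, feature="mRNA", id_key="ID"):
    """Yield assumed .chrom rows: name, scaffold, strand, start, stop."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/header lines
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Keep only well-formed 9-column records of the requested feature type
        if len(cols) != 9 or cols[2] != feature:
            continue
        seqid, _source, _type, start, stop, _score, strand, _phase, attributes = cols
        name = parse_attributes(attributes).get(id_key)
        if name is None:
            continue
        yield "\t".join([name, seqid, strand, start, stop])


# Small demo using the first two records from this issue (tabs are assumed)
demo = [
    "GWHACBE00000001\tEVM\tgene\t1225623\t1239308\t.\t-\t.\t"
    "ID=evm.TU.Chr01.49;Accession=GWHGACBE000049;Name=evm.TU.Chr01.49\n",
    "GWHACBE00000001\tEVM\tmRNA\t1225623\t1239308\t.\t-\t.\t"
    "ID=evm.model.Chr01.49;Accession=GWHTACBE000049;"
    "Parent=evm.TU.Chr01.49;Parent_Accession=GWHGACBE000049\n",
]
for row in gff_to_chrom(demo):
    print(row)
```

To run it on a real file, pass an open file handle instead of `demo` (use `gzip.open(path, "rt")` for a .gz file). The names in the first column probably need to match the headers in your protein FASTA, so if those use the `Accession` values instead, try `id_key="Accession"`.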

pedres commented 1 year ago

Thanks for the suggestion, I will try it.

conchoecia commented 8 months ago

Note: this should be fixed when decay-branch is folded in.