chapmanb / bcbb

Incubator for useful bioinformatics code, primarily in Python and R
http://bcbio.wordpress.com

Regarding gff2_to_gff3.py script #111

Open jeffinrockey opened 7 years ago

jeffinrockey commented 7 years ago

Hi, I came across an issue after running the gff2_to_gff3.py script on a GFF file from GigaDB.

Steps to reproduce:

wget -c ftp://climb.genomics.cn/pub/10.5524/100001_101000/100028/PigeonPea_V5.0.gene.gff.gz
python bcbb/gff/Scripts/gff/gff2_to_gff3.py PigeonPea_V5.0.gene.gff
gff3ToGenePred PigeonPea_V5.0.gene.gff3 PigeonPea_V5.0.gene.genePred

gff3ToGenePred is from the UCSC tools. It reported the following errors:

PigeonPea_V5.0.gene.gff3:39595: expected name=value: =
PigeonPea_V5.0.gene.gff3:39596: expected name=value: =
GFF3: 51 parser errors

I then cross-checked with GenomeTools as well:

$ gt gff3validator PigeonPea_V5.0.gene.gff3
gt gff3validator: error: attribute "=" on line 928 in file "PigeonPea_V5.0.gene.gff3" has no tag

The lines that cause the problem are as follows:

CcLG01  GlimmerHMM      mRNA    1872745 1876232 .       -       .       ID=C.cajan_19370;evid_id=CcLG01.path1.gene185
CcLG01  GlimmerHMM      CDS     1872745 1873186 .       -       1       =;Parent=C.cajan_19370
CcLG01  GlimmerHMM      CDS     1873701 1873731 .       -       2       =;Parent=C.cajan_19370

As seen, the conversion to GFF3 reported no error, but errors appeared downstream. (I initially saw GLEAN in the GFF file and learned that GLEAN produces GFF2, which is why I tried this script.) I would appreciate your advice on this.
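
To list every affected line without relying on gff3ToGenePred or gt, a small check along these lines might help. It is only a sketch: the function name find_untagged_attributes is made up for illustration, and it assumes the converted file is tab-separated with the attributes in the ninth column, as GFF3 expects. It does not use or modify gff2_to_gff3.py.

```python
import sys


def find_untagged_attributes(lines):
    """Yield (line_number, attribute) for column-9 entries with no tag.

    Hypothetical helper: an entry is reported when it has no '=' at all
    or when nothing precedes the '=' (for example a bare "=").
    """
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines, directives and comments.
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        # Assumption: feature lines have exactly nine tab-separated columns.
        if len(columns) != 9:
            continue
        for attribute in columns[8].split(";"):
            attribute = attribute.strip()
            if not attribute:
                continue
            tag, separator, _value = attribute.partition("=")
            if not separator or not tag.strip():
                yield number, attribute


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for number, attribute in find_untagged_attributes(handle):
            print(f"{number}\t{attribute!r}")
```

Saved under any file name and run with PigeonPea_V5.0.gene.gff3 as its only argument, it prints the line number and the offending attribute for each hit, which can then be compared against the corresponding lines in the original GFF.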

Regards, Jeffin Rockey