legumeinfo / datastore-issues

mostly for issues pertaining to the content of the legumeinfo datastore; may also relate to characteristics of its user interface or managing the mirroring process to the legfed instance

glyma.Zh13.gnm2.ann1.FJ3G.gene_models_main.gff3.gz is not sorted #172

Closed: sammyjava closed this issue 1 year ago

sammyjava commented 1 year ago

[convertFile] ## Validating glyma collection Zh13.gnm2.ann1.FJ3G
[convertFile]  - glyma.Zh13.gnm2.ann1.FJ3G.gene_models_main.gff3.gz
[convertFile] ## INVALID: glyma.Zh13.gnm2.ann1.FJ3G.gene_models_main.gff3.gz record parent attribute is invalid; does the file need sorting?
[convertFile] ## INVALID: glyma.Zh13.gnm2.Chr01      ShortStack      miRNA   -5336733        -5336711        0.0     .       ID=glyma.Zh13.gnm2.ann1.SoyZH13_miRNA_001-3p;Name=SoyZH13_miRNA_001-3p;Parent=glyma.Zh13.gnm2.ann1.SoyZH13_miRNA_001
[convertFile] ## INVALID: {Parent=glyma.Zh13.gnm2.ann1.SoyZH13_miRNA_001, ID=glyma.Zh13.gnm2.ann1.SoyZH13_miRNA_001-3p, Name=SoyZH13_miRNA_001-3p}

sammyjava commented 1 year ago

I'm not sure who to blame for this, so I assigned it to both of you to pass the blame on. This blocks completion of the current GlycineMine build.

adf-ncgr commented 1 year ago

I'll take a look. I am not liking the negative coordinates appearing in your snippet.

adf-ncgr commented 1 year ago

Not sure why those coordinates looked negative in your log output; they were fine. But there was a sorting issue: the sorting script wasn't accounting for some transcript types and was placing them before their Parent genes. Hopefully fixed now.

sammyjava commented 1 year ago

Weird. That's BioJava featureI.toString() doing that. Strange.
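
For anyone hitting the same "record parent attribute is invalid" message, here is a minimal sketch of the kind of ordering check that would flag it: it reports every record whose Parent has not yet appeared as an ID on an earlier line. This is not the datastore validator or the sorting script from this thread; the function names `find_orphaned_children` and `check_gff3` are made up for illustration, and it assumes a plain 9-column GFF3 with `key=value` attributes separated by `;`.

```python
import gzip


def find_orphaned_children(lines):
    """Return (line_number, feature_id, parent_id) for each record whose
    Parent has not been seen as an ID on an earlier line."""
    seen_ids = set()
    problems = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines, comments, and directives such as ##gff-version.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a feature record; ignored in this sketch
        # Column 9 holds the attributes, e.g. "ID=x;Name=y;Parent=z".
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        # A feature may list several parents, separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent and parent not in seen_ids:
                problems.append((lineno, attrs.get("ID"), parent))
        if "ID" in attrs:
            seen_ids.add(attrs["ID"])
    return problems


def check_gff3(path):
    """Run the check on a GFF3 file, gzipped or not (hypothetical helper)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return find_orphaned_children(handle)
```

An empty list from `check_gff3("some.gff3.gz")` means every child follows its parent; any tuples returned point at the line numbers that need re-sorting. Note that the check only looks at ordering, so it would also report a Parent that is missing from the file altogether.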