bioperl / bioperl-live

Core BioPerl 1.x code
http://bioperl.org

Bio::Tools::GFF adds a problematic extra empty attribute #388

Open Juke34 opened 7 months ago

Juke34 commented 7 months ago

When reading a feature such as this one:

chr10p ambMex60DD gene 313039 315424 1000 + . gene_id "AMEX60DD000001"; gene_name "ZFP37 [nr]|ZNF568 [hs]";

with the GFF parser version set to 2 or 2.5, it creates this object:

$VAR1 = bless( {
                 '_primary_tag' => 'gene',
                 '_root_cleanup_methods' => [
                                              sub { "DUMMY" }
                                            ],
                 '_parse_h' => {},
                 '_gsf_seq_id' => 'chr10p',
                 '_gsf_tag_hash' => {
                                      'score' => [
                                                   '1000'
                                                 ],
                                      'gene_name' => [
                                                       'ZFP37 [nr]|ZNF568 [hs]'
                                                     ],
                                      'ID' => [
                                                ' '
                                              ],
                                      'gene_id' => [
                                                     'AMEX60DD000001'
                                                   ]
                                    },
                 '_location' => bless( {
                                         '_end' => '315424',
                                         '_start' => '313039',
                                         '_location_type' => 'EXACT',
                                         '_strand' => 1
                                       }, 'Bio::Location::Simple' ),
                 '_gsf_frame' => '.',
                 '_source_tag' => 'ambMex60DD'
               }, 'Bio::SeqFeature::Generic' );

This part is problematic:

                                      'ID' => [
                                                ' '
                                              ],

It shouldn't appear, as it is absent from the GFF/GTF feature provided as input.
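A minimal script along these lines should reproduce it, and it also shows a stopgap on the caller side. This is a sketch, not a fix for the parser itself. The tab-separated columns are an assumption (the feature above is displayed with spaces), and `first_feature` and `strip_blank_tags` are hypothetical helper names, not part of BioPerl.

```perl
use strict;
use warnings;
use Bio::Tools::GFF;

# The feature from this report. The columns are joined with tabs here, which
# is an assumption: the report displays them separated by spaces.
my $line = join("\t",
    'chr10p', 'ambMex60DD', 'gene', 313039, 315424, 1000, '+', '.',
    'gene_id "AMEX60DD000001"; gene_name "ZFP37 [nr]|ZNF568 [hs]";',
) . "\n";

# Hypothetical helper: parse the first feature found in a GFF string.
sub first_feature {
    my ($gff_text, $version) = @_;
    open my $fh, '<', \$gff_text
        or die "Cannot open in-memory handle: $!";
    my $gffio   = Bio::Tools::GFF->new(-fh => $fh, -gff_version => $version);
    my $feature = $gffio->next_feature;
    $gffio->close;
    return $feature;
}

# Hypothetical workaround: drop every tag whose values are all blank,
# such as the 'ID' => [' '] entry shown in the dump above.
sub strip_blank_tags {
    my ($feature) = @_;
    for my $tag ($feature->get_all_tags) {
        my @values = $feature->get_tag_values($tag);
        $feature->remove_tag($tag)
            unless grep { defined($_) && /\S/ } @values;
    }
    return $feature;
}

my $feature = first_feature($line, 2);   # 2.5 can be passed the same way
strip_blank_tags($feature);

# Print the tags that remain after the cleanup.
for my $tag (sort $feature->get_all_tags) {
    my @values = $feature->get_tag_values($tag);
    print "$tag => [", join(', ', map { "'$_'" } @values), "]\n";
}
```

Commenting out the `strip_blank_tags($feature)` call should show the tags as the parser produced them, which makes it easy to check whether the blank `ID` is still present in a given BioPerl version.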

cjfields commented 6 months ago

@Juke34 we can address this but can't commit to a timeline. If you know the specific fix we also (gladly!) accept pull requests.