Open myoshida0215 opened 1 week ago
Thank you for sharing the details of the error with Ginger (v1.0.1). From the traceback, it seems that the issue is related to how the GFF attributes are being parsed, particularly where key-value pairs are expected but not found.
Could you kindly share a portion of the GFF file generated by Ginger that triggered this error? This would help us better understand the cause and provide more specific guidance.
Yes, these are our files that are giving the error. test.1.gff.txt test.1.fa.txt
The error appears to be caused by semicolons (;) inside the ID value, which the parser treats as the delimiter between attributes (e.g., ID=mRNA_1;Aargo017135;). According to the GFF specification, semicolons should be used exclusively to separate key=value pairs in the attributes field. To resolve this, replace the problematic semicolons within the ID value with another symbol, such as a colon (:), so they are no longer misinterpreted.
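The failure can be reproduced in isolation. The sketch below mimics the "split on `;`, then split on `=`" step visible in the traceback; `parse_attributes` is a hypothetical stand-in written for illustration, not the actual gffpandas function.

```python
def parse_attributes(attributes: str) -> dict:
    """Hypothetical stand-in for the parsing step seen in the traceback."""
    # Split the attributes column on ';', then split each piece on '='.
    return dict([pair.split("=") for pair in attributes.split(";")])


good = "ID=mRNA_1;Parent=gene_1"
bad = "ID=mRNA_1;Aargo017135;"  # stray ';' inside the ID value

print(parse_attributes(good))

try:
    parse_attributes(bad)
except ValueError as err:
    # 'Aargo017135' contains no '=', so it splits into a 1-element list
    # and dict() cannot turn it into a key/value pair.
    print(f"ValueError: {err}")
```

The second piece of `bad` (index 1, counting from zero) has no `=`, which is consistent with "element #1 has length 1" in the reported error.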
It seems that semicolons (;) are improperly used not only within the ID field but also in other key-value pairs within the attributes field. Based on your example, keys like Note, gene, and potentially others also suffer from this issue. To address this comprehensively, the goal is to ensure that all semicolons within values (and not between key-value pairs) are replaced with a different delimiter, such as a colon (:).
This one-liner may work with your GFF file:
sed -E 's/(ID|Parent|Note|gene)=([^;]+);([^;=]+)(;|$)/\1=\2:\3\4/g' input.gff > output.gff
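The sed command covers the four listed keys and one stray semicolon per value. If other keys are affected, or a value contains several semicolons, a small Python script may be easier to adapt. The following is a minimal sketch, assuming a tab-separated GFF with the attributes in column 9 and assuming that any piece without `=` belongs to the preceding value; all function names here are hypothetical.

```python
def sanitize_attributes(attributes: str, joiner: str = ":") -> str:
    """Merge pieces that lack '=' into the preceding key=value pair."""
    fixed = []
    for piece in attributes.strip().split(";"):
        piece = piece.strip()
        if not piece:
            continue  # drop empty pieces left by a trailing or doubled ';'
        if "=" in piece or not fixed:
            fixed.append(piece)
        else:
            # Assumption: a piece without '=' continues the previous value.
            fixed[-1] += joiner + piece
    return ";".join(fixed)


def sanitize_gff_line(line: str) -> str:
    """Apply sanitize_attributes to column 9 of a GFF feature line."""
    if line.startswith("#"):
        return line  # leave directives and comments untouched
    newline = "\n" if line.endswith("\n") else ""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != 9:
        return line  # not a feature line; pass through unchanged
    columns[8] = sanitize_attributes(columns[8])
    return "\t".join(columns) + newline


def sanitize_gff_file(src_path: str, dst_path: str) -> None:
    """Write a cleaned copy of src_path to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(sanitize_gff_line(line))


print(sanitize_attributes("ID=mRNA_1;Aargo017135;Parent=gene_1;Note=a;b;c;"))
# prints: ID=mRNA_1:Aargo017135;Parent=gene_1;Note=a:b:c
```

Calling `sanitize_gff_file("input.gff", "output.gff")` would process a whole file. Note that this sketch also removes a trailing `;`, and it does not handle values that themselves contain `=`.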
Dear provider,
When I tried using the GFF output of Ginger (v1.0.1), it gave the following error. https://academic.oup.com/dnaresearch/article/30/4/dsad017/7227702
Could you suggest what might be the cause?
Traceback (most recent call last):
  File "/Users/yoshidamasaaki/Documents/Data/PAGS2023/2023.10.16/GFF2MSS-master/GFF2MSS.py", line 558, in <module>
    gff_df_col = gff_df.attributes_to_columns()
  File "/Users/yoshidamasaaki/Documents/Data/PAGS2023/2023.10.16/GFF2MSS-master/MSS/lib/python3.9/site-packages/gffpandas/gffpandas.py", line 132, in attributes_to_columns
    attribute_df['at_dic'] = attribute_df.attributes.apply(
  File "/Users/yoshidamasaaki/Documents/Data/PAGS2023/2023.10.16/GFF2MSS-master/MSS/lib/python3.9/site-packages/pandas/core/series.py", line 4917, in apply
    return SeriesApply(
  File "/Users/yoshidamasaaki/Documents/Data/PAGS2023/2023.10.16/GFF2MSS-master/MSS/lib/python3.9/site-packages/pandas/core/apply.py", line 1427, in apply
    return self.apply_standard()
  File "/Users/yoshidamasaaki/Documents/Data/PAGS2023/2023.10.16/GFF2MSS-master/MSS/lib/python3.9/site-packages/pandas/core/apply.py", line 1507, in apply_standard
    mapped = obj._map_values(
  File "/Users/yoshidamasaaki/Documents/Data/PAGS2023/2023.10.16/GFF2MSS-master/MSS/lib/python3.9/site-packages/pandas/core/base.py", line 921, in _map_values
    return algorithms.map_array(arr, mapper, na_action=na_action, convert=convert)
  File "/Users/yoshidamasaaki/Documents/Data/PAGS2023/2023.10.16/GFF2MSS-master/MSS/lib/python3.9/site-packages/pandas/core/algorithms.py", line 1743, in map_array
    return lib.map_infer(values, mapper, convert=convert)
  File "lib.pyx", line 2972, in pandas._libs.lib.map_infer
  File "/Users/yoshidamasaaki/Documents/Data/PAGS2023/2023.10.16/GFF2MSS-master/MSS/lib/python3.9/site-packages/gffpandas/gffpandas.py", line 133, in <lambda>
    lambda attributes: dict([key_value_pair.split('=') for
ValueError: dictionary update sequence element #1 has length 1; 2 is required
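The ValueError says that one piece of an attributes string did not split into exactly a key and a value. To locate such lines before running the conversion, a scan like the following might help. This is a sketch under assumptions: `find_bad_attributes` is a hypothetical helper, the input is assumed to be tab-separated with attributes in column 9, and the sample rows are made up for illustration.

```python
from typing import Iterable, Iterator, Tuple


def find_bad_attributes(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, piece) for pieces that are not a single key=value."""
    for number, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9:
            continue  # not a standard feature line
        for piece in columns[8].split(";"):
            # Same test the failing dict() call implies: exactly two parts.
            if len(piece.split("=")) != 2:
                yield number, piece


# Made-up sample rows, not taken from the attached files.
sample = [
    "##gff-version 3\n",
    "chr1\tsrc\tmRNA\t1\t100\t.\t+\t.\tID=mRNA_1;Aargo017135\n",
    "chr1\tsrc\texon\t1\t100\t.\t+\t.\tID=exon_1;Parent=mRNA_1\n",
]

for number, piece in find_bad_attributes(sample):
    print(f"line {number}: unexpected piece {piece!r}")
```

To check a real file, pass an open file handle instead of `sample`, for example `find_bad_attributes(open("input.gff"))`. An empty piece reported by this scan would indicate a trailing or doubled semicolon.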