geneontology / go-site

A collection of metadata, tools, and files associated with the Gene Ontology public web presence.
http://geneontology.org
BSD 3-Clause "New" or "Revised" License

GAF validation reports invalid number of columns #174

Open sabrinatoro opened 8 years ago

sabrinatoro commented 8 years ago

Hello, ZFIN updated their GAF file to support the "AND/OR" logic (comma/pipe) in column 8. Since this update, our GAF validation checks have been unstable. See the last one here: http://build.berkeleybop.org/job/gaf-check-zfin/128/

The gaf-validation-report.txt reports WARNING : Got invalid number of columns (expected 15, got 17)

Is this a bug in the GAF checking process related to our updated GAF version? Or is there actually a problem with our GAF file?

Thank you very much Sabrina

sabrinatoro commented 8 years ago

Hello, we are still getting the "invalid number of columns (expected 15, got 17)" warning in our gaf-validation-report. The latest one: http://build.berkeleybop.org/job/gaf-check-zfin/134/

Could someone have a look at it and tell us if there is an actual problem with our GAF file, please? Thank you! Sabrina

CC: @cmungall

kltm commented 8 years ago

I seem to recall another open ticket about this (incorrect format version identification)--I'll take a look around for it. I've assigned @cmungall for further comment when he gets back.

kltm commented 8 years ago

I looked in my notes and found the following comment:

   18:49:30,958 WARN  owltools.gaf.GAFParser.loadNext(GAFParser.java:171)  - Got invalid number of columns for row (expected 15, got 17). The '4245' row is ignored. : MGI  MGI:2685243 Zscan4c     GO:0045950  MGI:MGI:4440771|PMID:20336070   IMP     P   zinc finger and SCAN domain containing 4C   LOC245109   protein taxon:10090 20111115    MGI regulates_o_occurs_in(CL:0002322)   VEGA:OTTMUSP00000044032

IIRC, we decided it was harmless (for my use case) and that we should look into better warning suppression in some cases. This may be a place to start for getting the reports more reasonable.

hdrabkin commented 8 years ago

Just a thought: do you declare that this is a GAF 2.1? (line 1, !gaf-version: 2.1) GAF 1.0 specifies 15 columns rather than 17, so is this checker thinking this is a GAF 1.0?
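To make the question concrete, here is a minimal sketch of how a checker might derive the expected column count from the version header. It is not the owltools implementation: the function name `check_gaf_columns` and the `default_major` fallback are hypothetical, and the only facts assumed are that GAF 1.0 rows have 15 columns and GAF 2.x rows have 17.

```python
import re

# Column counts per GAF major version: GAF 1.0 rows have 15 columns,
# GAF 2.x rows have 17.
EXPECTED_COLUMNS = {"1": 15, "2": 17}

# Matches a header such as "!gaf-version: 2.1" (written with a hyphen).
VERSION_RE = re.compile(r"^!gaf-version:\s*([12])(?:\.\d+)?\s*$")


def check_gaf_columns(lines, default_major="1"):
    """Return (major_version, problems) for an iterable of GAF lines.

    default_major is a hypothetical fallback used when no version header
    is recognized; a real checker may behave differently.
    """
    major = None
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if line.startswith("!"):
            # Header or comment line: only the first version header counts.
            match = VERSION_RE.match(line)
            if match and major is None:
                major = match.group(1)
            continue
        if not line.strip():
            continue  # skip blank lines
        expected = EXPECTED_COLUMNS[major or default_major]
        got = len(line.split("\t"))
        if got != expected:
            problems.append((number, expected, got))
    return major or default_major, problems
```

Under these assumptions, a 17-column row is accepted when the header is recognized, and is reported as "expected 15, got 17" when the checker falls back to version 1.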

sabrinatoro commented 8 years ago

Thank you all! @hdrabkin : yes, we do declare that it is a GAF 2.1 file. line 1 : !gaf_version: 2.1

@kltm : I thought it should be harmless too (especially since it is 'only' a warning). But I wanted to make sure that this is not creating other problems of which I am not aware.

Thanks again!

kltm commented 8 years ago

Even if harmless, it does cause rather spectacular log files. We should aim to suppress unneeded warnings regardless.
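As a rough illustration of what quieter reports could look like, the sketch below collapses the repeated column-count warnings into one count per (expected, got) pair. The function name `summarize_column_warnings` is hypothetical, and the message pattern is assumed from the two warning lines quoted in this thread; the real log format may vary.

```python
import re
from collections import Counter

# Assumed message shape, taken from the warnings quoted in this thread.
COLUMN_WARNING = re.compile(
    r"Got invalid number of columns(?: for row)? \(expected (\d+), got (\d+)\)"
)


def summarize_column_warnings(log_lines):
    """Count column-count warnings by (expected, got); return other lines unchanged."""
    counts = Counter()
    other = []
    for line in log_lines:
        match = COLUMN_WARNING.search(line)
        if match:
            counts[(int(match.group(1)), int(match.group(2)))] += 1
        else:
            other.append(line)
    return counts, other
```

A report could then print a single summary line per pair, for example the number of rows with 17 columns where 15 were expected, followed by the remaining log lines.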