biolink / biolink-model-toolkit

A collection of useful python functions for looking up information and working with the Biolink Model
https://biolink.github.io/biolink-model-toolkit/
BSD 3-Clause "New" or "Revised" License

Toolkit fails to populate infores_map because of mismatching number of fields #129

Closed stimon closed 11 months ago

stimon commented 1 year ago

I'm using KGX, and when trying to instantiate a new NxGraph object, it throws an IndexError: list index out of range because the Toolkit class fails to populate the infores_map.

The error occurs at line 70, because line 52 of infores_catalog_nodes.tsv has fewer than 6 fields. I checked, and line 206 is also shorter than expected, but parsing never reaches that point.

I guess fixing infores_catalog_nodes.tsv should solve this problem, but adding a field length check would also make sense.
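A field length check along these lines could be sketched as follows. This is only an illustration, not the toolkit's code: `find_short_rows`, `EXPECTED_FIELDS`, the `id` column name, and the sample rows are all hypothetical, and it assumes the catalog is plain tab-separated text with six columns per row.

```python
import csv
import io

# Assumption: six columns per row (status, name, id, url, synonyms, description).
EXPECTED_FIELDS = 6


def find_short_rows(tsv_text, expected=EXPECTED_FIELDS):
    """Return (line_number, field_count) for every row with too few fields."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return [
        (line_number, len(row))
        for line_number, row in enumerate(reader, start=1)
        if len(row) < expected
    ]


# Made-up sample data: the third row is missing its last field.
sample = (
    "status\tname\tid\turl\tsynonyms\tdescription\n"
    "released\tExample KP\tinfores:example\thttps://example.org\t\tAn example\n"
    "released\tShort Row\tinfores:short\thttps://example.org\t\n"
)
short_rows = find_short_rows(sample)
print(short_rows)
```

Running a check like this against the real file before loading it would report every short row at once, instead of stopping at the first one.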

Best, Santi

sierra-moxon commented 1 year ago

Hi @stimon - good catch! We did release a version of BMT on Friday last week that handles the change in field length in the infores catalog, https://github.com/biolink/biolink-model-toolkit/releases/tag/v1.0.13. Would it be possible for you to try and upgrade?

In our partner repo, https://github.com/biolink/biolink-model/ we also added a validation script for the infores catalog in version https://github.com/biolink/biolink-model/releases/tag/v3.3.2.

stimon commented 1 year ago

Hi @sierra-moxon, sorry I haven't been able to look at this in a while.

I was already on v1.0.13. Maybe the real problem is that those lines in infores_catalog_nodes should have all the expected fields but don't (missing separators?).

I barely have the context to do much else, but this naive workaround at least lets the toolkit parse the lines.

if len(line) >= 6:
    self.infores_map[line[2]] = {
        "status": line[0],
        "name": line[1],
        "url": line[3],
        "synonyms": line[4],
        "description": line[5],
    }
else:
    self.infores_map[line[2]] = {
        "status": line[0],
        "name": line[1],
        "url": line[3],
        "synonyms": line[4],
        "description": "",
    }
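
The workaround above still indexes `line[4]`, so it would fail on a row with fewer than five fields. A slightly more general variant, sketched here as a standalone function, pads short rows with empty strings before building the entry. The names `FIELD_NAMES` and `parse_infores_row` and the sample rows are hypothetical; only the field order is taken from the snippet above.

```python
# Field order taken from the workaround above; "id" is an assumed name for
# the column used as the map key (line[2]).
FIELD_NAMES = ("status", "name", "id", "url", "synonyms", "description")


def parse_infores_row(line):
    """Return (key, record) for one split row, or None if the key is missing."""
    if len(line) < 3 or not line[2]:
        return None  # no identifier to use as the map key
    # Pad short rows with empty strings; zip ignores any extra trailing fields.
    padded = list(line) + [""] * (len(FIELD_NAMES) - len(line))
    record = dict(zip(FIELD_NAMES, padded))
    key = record.pop("id")
    return key, record


# Made-up rows: one complete, one missing its description, one unusable.
rows = [
    ["released", "Example KP", "infores:example", "https://example.org", "", "An example"],
    ["released", "Short Row", "infores:short", "https://example.org"],
    ["released", "No Key"],
]

infores_map = {}
for row in rows:
    parsed = parse_infores_row(row)
    if parsed is not None:
        key, record = parsed
        infores_map[key] = record
```

Padding keeps a single code path for complete and incomplete rows, at the cost of silently accepting malformed catalog entries.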
sierra-moxon commented 1 year ago

Removed the infores dependency from BMT.