monarch-initiative / mondo-ingest

Coordinating the mondo-ingest with external sources
https://monarch-initiative.github.io/mondo-ingest/

Bug: `match-mondo-sources-all-lexical.py`: Unknown `Synonymizer` rules #519

Closed joeflack4 closed 1 month ago

joeflack4 commented 1 month ago

Overview

It looks like when oaklib got updated recently, the mapping-rules-datamodel got updated and is causing breaking changes.

The error

ValueError: Unknown argument: the_rule = 'Remove box brackets bound info from the

Log

```
python ../scripts/match-mondo-sources-all-lexical.py run tmp/merged.db \
    -c metadata/mondo.sssom.config.yml \
    -r config/mondo-match-rules.yaml \
    --rejects ../mappings/rejected-mappings.tsv \
    -o ../mappings/mondo-sources-all-lexical.sssom.tsv
/usr/local/lib/python3.10/dist-packages/sssom/util.py:168: FutureWarning: Downcasting behavior in `replace` is deprecated and will be removed in a future version. To retain the old behavior, explicitly call `result.infer_objects(copy=False)`. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`
  df.replace("", np.nan, inplace=True)
Traceback (most recent call last):
  File "/work/src/ontology/../scripts/match-mondo-sources-all-lexical.py", line 151, in <module>
    main()
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/work/src/ontology/../scripts/match-mondo-sources-all-lexical.py", line 116, in run
    ruleset = load_mapping_rules(rules)
  File "/usr/local/lib/python3.10/dist-packages/oaklib/utilities/lexical/lexical_indexer.py", line 518, in load_mapping_rules
    return yaml_loader.load(path, target_class=MappingRuleCollection)
  File "/usr/local/lib/python3.10/dist-packages/linkml_runtime/loaders/loader_root.py", line 76, in load
    results = self.load_any(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/linkml_runtime/loaders/yaml_loader.py", line 42, in load_any
    return self._construct_target_class(data_as_dict, target_class)
  File "/usr/local/lib/python3.10/dist-packages/linkml_runtime/loaders/loader_root.py", line 137, in _construct_target_class
    return target_class(**data_as_dict)
  File "<string>", line 5, in __init__
  File "/usr/local/lib/python3.10/dist-packages/oaklib/datamodels/mapping_rules_datamodel.py", line 86, in __post_init__
    self.rules = [v if isinstance(v, MappingRule) else MappingRule(**as_dict(v)) for v in self.rules]
  File "/usr/local/lib/python3.10/dist-packages/oaklib/datamodels/mapping_rules_datamodel.py", line 86, in <listcomp>
    self.rules = [v if isinstance(v, MappingRule) else MappingRule(**as_dict(v)) for v in self.rules]
  File "<string>", line 8, in __init__
  File "/usr/local/lib/python3.10/dist-packages/oaklib/datamodels/mapping_rules_datamodel.py", line 126, in __post_init__
    self.synonymizer = Synonymizer(**as_dict(self.synonymizer))
  File "<string>", line 11, in __init__
  File "/usr/local/lib/python3.10/dist-packages/oaklib/datamodels/mapping_rules_datamodel.py", line 435, in __post_init__
    super().__post_init__(**kwargs)
  File "/usr/local/lib/python3.10/dist-packages/linkml_runtime/utils/yamlutils.py", line 53, in __post_init__
    raise ValueError('\n'.join(messages))
ValueError: Unknown argument: the_rule = 'Remove box brackets bound info from the
make[1]: *** [mondo-ingest.Makefile:439: ../mappings/mondo-sources-all-lexical.sssom.tsv] Error 1
```

Possible solutions

a. Comment out / remove all Synonymizer rules
This allows the script to proceed without error, but I don't know how the results would be affected. I imagine these rules are there for good reason.

b. Update Synonymizer rules
I looked very briefly through mapping_rules_datamodel.yaml in oaklib, but I'm not sure yet what I'd need to do. I was expecting to see that the rule descriptions changed only slightly, but at a glance, I'm not seeing anything quite like the rules in mondo-ingest's mondo-match-rules.yaml.

Additional information

I was able to run lexmatch successfully in a debug environment using an older version, oaklib==0.5.25. After updating that environment to the same version as the ODK dev container, oaklib==0.6.5, I got the same error that I see in the ODK build.
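To confirm which versions a given environment actually has installed (for example, the debug environment versus the ODK dev container), a small check like the following could be run in each one. This is only a sketch: it uses the standard-library `importlib.metadata`, the helper name `installed_version` is made up for illustration, and the list of packages is simply those that appear in the traceback above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is a hypothetical helper name, not part of oaklib or ODK.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Packages that show up in the traceback; adjust the list as needed.
for name in ("oaklib", "linkml-runtime", "sssom"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Comparing the printed versions between the two environments should show whether they really differ in oaklib.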

Here are the Synonymizer rules in mondo-match-rules.yaml in mondo-ingest: https://github.com/monarch-initiative/mondo-ingest/blob/e093456bdb193f609b71e3ba5ce7baab3834fcd5/src/ontology/config/mondo-match-rules.yaml#L65-L112

And here is the mapping_rules_datamodel.yaml in OAK.

During my troubleshooting, I verified that each one of the rules in the mondo-ingest file is invalid and will throw an error.
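A check like the one described above can be sketched without loading oaklib at all: compare the keys of each `synonymizer` entry against the fields a model accepts and report the ones it would reject. Everything here is an assumption for illustration: `SynonymizerStandIn` and its field names are hypothetical and are not taken from oaklib's schema, and the example rules are invented rather than copied from mondo-match-rules.yaml.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class SynonymizerStandIn:
    """Hypothetical stand-in model; these field names are assumptions,
    not oaklib's actual Synonymizer schema."""

    description: Optional[str] = None
    match: Optional[str] = None
    match_scope: Optional[str] = None
    replacement: Optional[str] = None


def unknown_keys(entry: dict[str, Any], model: type) -> list[str]:
    """Return the keys in `entry` that are not dataclass fields of `model`."""
    allowed = {f.name for f in fields(model)}
    return sorted(key for key in entry if key not in allowed)


def check_rules(rules: list[dict[str, Any]], model: type = SynonymizerStandIn) -> dict[int, list[str]]:
    """Map rule index -> unknown synonymizer keys, for rules that have any."""
    report: dict[int, list[str]] = {}
    for index, rule in enumerate(rules):
        synonymizer = rule.get("synonymizer")
        if synonymizer:
            bad = unknown_keys(synonymizer, model)
            if bad:
                report[index] = bad
    return report


# Invented example data, shaped like a parsed rules file.
example_rules = [
    {"synonymizer": {"the_rule": "example rule text", "match": r"\[.*\]", "replacement": ""}},
    {"synonymizer": {"description": "example", "match": r"\s+", "replacement": " "}},
    {"description": "a rule with no synonymizer"},
]

print(check_rules(example_rules))  # {0: ['the_rule']}
```

The `__post_init__` frames in the traceback suggest oaklib's own `Synonymizer` is a dataclass, so passing it as `model` (with the rules parsed from the YAML file) may work the same way, but that is untested here.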

joeflack4 commented 1 month ago

@cmungall Can you take a look at this and let me know if you have any ideas? I'm unfamiliar with this part of things, and Harshad is going to be out for a while.

matentzn commented 1 month ago

Are you sure you have updated from the master branch? I fixed this two weeks ago:

https://github.com/monarch-initiative/mondo-ingest/commit/29649b89d5b1e467284a0328afa869c4d08d0f67

twhetzel commented 1 month ago

Yes, this was fixed already and I ran the branch this was fixed in to verify.

joeflack4 commented 1 month ago

@twhetzel @matentzn I have been doing regular pulls from main and merges into develop about two times a week. And usually I do frequent rebases to my feature branches as well, but not always. It looks like in this case this is an older feature branch that hasn't been updated. I'll give that a go!

I'll probably just rebase/merge into all my open PRs whenever I do this main -> develop routine. But don't forget to tag me if there's something I need to review. I don't know whether this fix was something I should have reviewed, but I haven't been receiving many review requests recently. Also, for mondo-ingest, I think I'll want to know about every bug fix or feature addition.