Open joeflack4 opened 11 months ago
Thank you @joeflack4, this is a high priority for me! @hrshdhgd, how can we get this up the ladder of priorities?
https://github.com/INCATools/ontology-access-kit/pull/679: PR on the way!
The latest version, v0.5.22, should be quicker!
I am trying this on 0.6.6 and it is still much too slow:
    for curie in all_descendants:
        metadata_map = adapter.entity_metadata_map(curie)
        if "oio:inSubset" in metadata_map:
            list_of_subsets = metadata_map["oio:inSubset"]
            for subset in list_of_subsets:
                row = {
                    "id": curie,
                    "subset": subset
                }
                data_subsets.append(row)
I stopped this after 16 minutes.
@matentzn and others, I'm not sure if the code for .entity_metadata_map() is similar to .relationships_metadata(), but if so, it might be worth looking at PR #659, where Chris and I discuss a few different performance refactoring options for that method.
Overview
This method seems far too slow.
Example
In this PR, using mondo.db: each iteration of this loop took 15.5 seconds on average. At this rate, getting through ~25k mondo_ids would take an estimated ~100 hours.
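For reference, the ~100 hour figure is just the per-iteration time scaled up linearly. Here is a minimal sketch of that arithmetic; the helper name `hours_at_rate` is made up for illustration and is not part of OAK, and it assumes every call costs roughly the same:

```python
def hours_at_rate(seconds_per_call: float, n_calls: int) -> float:
    """Extrapolate total runtime in hours, assuming a constant per-call cost."""
    if seconds_per_call < 0 or n_calls < 0:
        raise ValueError("seconds_per_call and n_calls must be non-negative")
    return seconds_per_call * n_calls / 3600  # 3600 seconds per hour


# Numbers from this issue: 15.5 s per iteration, ~25k mondo_ids.
estimate = hours_at_rate(15.5, 25_000)
print(f"{estimate:.1f} hours")  # 107.6 hours
```

So "~100 hours" is, if anything, slightly optimistic for a full pass over mondo.db.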