Closed dbkeator closed 5 years ago
I strongly agree about the slowness. But I don't think it is feasible to store the definitions locally: such an approach wouldn't scale, users would need plenty of CSV files, and those CSV files could become outdated. In principle, though, the InterLex definition in the graph isn't strictly necessary, correct? It improves human readability, but machine readability is given in any case because of the IRIs to InterLex. I'd propose a conditional parameter to disable scraping (and thus omit the definitions). What do you think?
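To make the proposal concrete, here is a rough sketch of what such a switch could look like. It is not the code in `fs_to_nidm.py`: the function name `build_term_record`, the `scrape` flag, the `fetch_definition` callable, and the dictionary keys are all hypothetical and only stand in for whatever actually performs the InterLex lookup.

```python
from typing import Callable, Dict, Optional


def build_term_record(
    label: str,
    interlex_iri: str,
    fetch_definition: Optional[Callable[[str], Optional[str]]] = None,
    scrape: bool = True,
) -> Dict[str, str]:
    """Build a term record that always carries the IRI.

    The human-readable definition is added only when scraping is enabled
    and a fetcher is supplied. All names here are hypothetical.
    """
    # The IRI alone keeps the record machine-readable.
    record = {"label": label, "isAbout": interlex_iri}

    if scrape and fetch_definition is not None:
        try:
            definition = fetch_definition(interlex_iri)
        except OSError:
            # Network problems should not stop the conversion;
            # the record is simply left without a definition.
            definition = None
        if definition:
            record["definition"] = definition

    return record
```

With `scrape=False` the fetcher is never called, so the conversion would run without any network access, at the cost of leaving the definition out of the graph.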
One thing I don't like about this code is the dependency on live InterLex scraping of anatomy CDE properties. With a slow internet connection this can take quite a while, and without internet connectivity, or if InterLex is down, the code can't be run at all. (See function: https://github.com/dbkeator/segstats_jsonld/blob/master/segstats_jsonld/fs_to_nidm.py#L275)
One idea is to store the anatomy CDE details for each FS stats file.
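A minimal sketch of that idea, assuming one JSON file per stats file kept in a local directory. The function `load_cde_details`, the file layout, and the `fetch` callable are hypothetical and not part of the current code; `fetch` stands in for the existing live lookup.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Optional


def load_cde_details(
    stats_file: str,
    cache_dir: Path,
    fetch: Optional[Callable[[str], Dict]] = None,
) -> Dict:
    """Return the CDE details for one stats file, preferring a local copy.

    The cache file is named after the stats file, e.g. "aseg.stats.json".
    If no cached copy exists, `fetch` (hypothetical) is called once and
    its result is written to disk for later offline runs.
    """
    cache_path = Path(cache_dir) / f"{Path(stats_file).name}.json"

    # Offline path: use the stored details when they exist.
    if cache_path.exists():
        with cache_path.open() as fh:
            return json.load(fh)

    if fetch is None:
        raise FileNotFoundError(
            f"No cached CDE details for {stats_file} and no fetcher given"
        )

    # Online path: look the details up once, then store them locally.
    details = fetch(stats_file)
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("w") as fh:
        json.dump(details, fh, indent=2)
    return details
```

Deleting a cache file would force a fresh lookup on the next run, which is one simple way to refresh stored details.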