language-brainscore / langbrainscore

[Marked for Deprecation. please visit https://github.com/brain-score/language for the migrated project] Benchmarking of Language Models using Human Neural and Behavioral experiment data
https://language-brainscore.github.io/langbrainscore/
MIT License

implement zarr-based caching for major classes #28

Open aalok-sathe opened 2 years ago

aalok-sathe commented 2 years ago

We need reliable state caching for most classes so that results persist to disk for later analysis and reuse in pipelines. If cached results exist, they may be reused depending on a flag (e.g. overwrite_cache=False).
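A minimal sketch of the load-or-compute behaviour described above, not the project's actual implementation. It uses JSON files from the standard library as a stand-in for a zarr store, and the names `load_or_compute`, `cache_file`, and `compute` are hypothetical; only the `overwrite_cache` flag comes from this issue.

```python
import json
from pathlib import Path
from typing import Callable


def load_or_compute(
    cache_file: Path,
    compute: Callable[[], dict],
    overwrite_cache: bool = False,
) -> dict:
    """Return the cached result if present, otherwise compute and persist it.

    JSON is only a placeholder backend here; the issue proposes zarr.
    """
    # Reuse the cached result unless the caller asked to overwrite it.
    if cache_file.exists() and not overwrite_cache:
        return json.loads(cache_file.read_text())

    # Cache miss (or forced overwrite): compute and write to disk.
    result = compute()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(result))
    return result
```

With `overwrite_cache=False` a second call with the same `cache_file` reads from disk and never invokes `compute`; with `overwrite_cache=True` the result is recomputed and the file is replaced.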

aalok-sathe commented 2 years ago

Proposal: make the __repr__ method of each Cacheable class uniquely identify that instance. E.g., repr(BrainScore()) should contain information about the Mapping, the Metric, and the encoders (all of which can come from calling repr on those objects).
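One way the proposal could look, as a rough sketch only. The class names `Cacheable`, `Mapping`, `Metric`, and `BrainScore` come from this thread, but the constructor arguments, the `cache_key` method, and the choice of hashing the repr with SHA-256 are assumptions made for illustration.

```python
import hashlib


class Cacheable:
    """Base class whose repr is built from the instance's attributes."""

    def __repr__(self) -> str:
        # Nested Cacheable attributes contribute their own repr via !r,
        # so the outer repr identifies the whole configuration.
        params = ", ".join(f"{k}={v!r}" for k, v in sorted(vars(self).items()))
        return f"{type(self).__name__}({params})"

    def cache_key(self) -> str:
        # Hypothetical helper: a short, filesystem-safe digest of the repr.
        return hashlib.sha256(repr(self).encode("utf-8")).hexdigest()[:16]


class Mapping(Cacheable):
    def __init__(self, kind: str):  # hypothetical signature
        self.kind = kind


class Metric(Cacheable):
    def __init__(self, name: str):  # hypothetical signature
        self.name = name


class BrainScore(Cacheable):
    def __init__(self, mapping: Mapping, metric: Metric):  # hypothetical signature
        self.mapping = mapping
        self.metric = metric
```

Two instances built with the same configuration then share a repr and a cache key, while changing any nested parameter changes both. This only holds if every attribute has a deterministic repr; objects that fall back to the default `<... at 0x...>` form would break it.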

below list is in the form:

aalok-sathe commented 2 years ago

zarr is unable to cache xarrays that contain dtype object. Somehow dtype object is bleeding in from somewhere. Once that is corrected to string, this issue disappears. The problem is discussed in https://github.com/pydata/xarray/issues/3476 and is partially solved by commits in #34.
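A rough sketch of the workaround described above: find variables with dtype object and cast them to a fixed-width string dtype before writing. The helper names `object_vars` and `cast_object_to_str` are hypothetical, and this assumes every object-dtype variable really holds strings; it is not the code from the commits in #34.

```python
import numpy as np
import xarray as xr


def object_vars(ds: xr.Dataset) -> list:
    """Names of all variables (data variables and coordinates) with dtype object."""
    return [name for name, var in ds.variables.items() if var.dtype == object]


def cast_object_to_str(ds: xr.Dataset) -> xr.Dataset:
    """Return a copy of ds with object-dtype variables cast to unicode strings."""
    # Rebuild each offending variable as (dims, values) with a string dtype.
    new_data = {
        name: (ds[name].dims, ds[name].values.astype(str))
        for name in ds.data_vars
        if ds[name].dtype == object
    }
    new_coords = {
        name: (ds[name].dims, ds[name].values.astype(str))
        for name in ds.coords
        if ds[name].dtype == object
    }
    return ds.assign(new_data).assign_coords(new_coords)
```

Calling `object_vars` before and after the cast is a quick way to check where the object dtype is coming from and whether anything was missed.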