kundajelab / basepairmodels

MIT License

interpret saves h5 files extremely slowly #13

Closed mmtrebuchet closed 3 years ago

mmtrebuchet commented 3 years ago

For a run of about 12k regions, each 1 kb wide, the interpret script spends about as long saving the scores (about 1 GB) to disk as it does running through the regions and generating them. Since the save is a single call to DeepDish, I wonder whether it could be optimized to write those files much faster.

mmtrebuchet commented 3 years ago

(Writing the h5 files takes about half an hour on my system.)

zahoorz commented 3 years ago

Use the shap_scores script: https://github.com/kundajelab/basepairmodels/blob/master/basepairmodels/cli/shap_scores.py
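
For anyone who ends up writing score arrays to disk themselves, one option is to write the NumPy arrays straight to HDF5 with h5py rather than going through a single DeepDish call. The following is only a minimal sketch under stated assumptions: the helper `save_scores_h5`, the dataset names `hyp_scores` and `input_seqs`, and the array shapes are hypothetical placeholders, not the layout that interpret or shap_scores actually produces, and whether this is faster on a given system would need to be measured.

```python
import os
import tempfile

import h5py
import numpy as np


def save_scores_h5(path, arrays, compression=None):
    """Write each array in `arrays` as one top-level HDF5 dataset.

    Hypothetical helper for illustration; not part of basepairmodels.
    compression=None stores the data uncompressed, which avoids
    compression cost at the price of larger files.
    """
    with h5py.File(path, "w") as f:
        for name, values in arrays.items():
            f.create_dataset(name, data=values, compression=compression)


# Synthetic stand-in for real scores: 100 regions, 1000 bp, 4 bases.
# The dataset names below are assumptions made for this example.
rng = np.random.default_rng(0)
scores = {
    "hyp_scores": rng.standard_normal((100, 1000, 4)).astype(np.float32),
    "input_seqs": rng.integers(0, 2, size=(100, 1000, 4)).astype(np.int8),
}

out_path = os.path.join(tempfile.mkdtemp(), "scores.h5")
save_scores_h5(out_path, scores)

# Read everything back to confirm the round trip.
with h5py.File(out_path, "r") as f:
    loaded = {name: f[name][...] for name in f}
```

Passing `compression="gzip"` to the helper trades write speed for smaller files, so it is worth timing both settings on the real data before choosing one.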