openvax / pyensembl

Python interface to access reference genome features (such as genes, transcripts, and exons) from Ensembl
Apache License 2.0

Cache directories created for all releases, not just requested release #241

Closed: dhimmel closed this issue 3 years ago

dhimmel commented 3 years ago

When I run the following:

pyensembl install --release=100 --species=human

I get a cache directory for every Ensembl release:

/home/jovyan/work/data/.data_source_cache/pyensembl/GRCh37:
total 92K
drwxr-sr-x 23 jovyan users 4.0K Aug 21 19:35 ./
drwxr-sr-x  5 jovyan users 4.0K Aug 21 19:35 ../
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl55/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl56/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl57/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl58/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl59/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl60/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl61/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl62/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl63/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl64/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl65/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl66/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl67/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl68/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl69/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl70/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl71/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl72/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl73/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl74/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl75/

/home/jovyan/work/data/.data_source_cache/pyensembl/GRCh38:
total 108K
drwxr-sr-x 27 jovyan users 4.0K Aug 21 19:35 ./
drwxr-sr-x  5 jovyan users 4.0K Aug 21 19:35 ../
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:40 ensembl100/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl76/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl77/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl78/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl79/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl80/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl81/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl82/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl83/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl84/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl85/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl86/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl87/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl88/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl89/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl90/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl91/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl92/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl93/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl94/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl95/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl96/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl97/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl98/
drwxr-sr-x  2 jovyan users 4.0K Aug 21 19:35 ensembl99/

All of these directories are empty except pyensembl/GRCh38/ensembl100. I think it would be better to avoid cluttering the file system and to create these directories only when they are first needed.
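To illustrate the request, here is a minimal sketch of creating a cache directory only on first use. It is not pyensembl's actual code: the class `LazyCacheDir`, its method `ensure_exists`, and the function `demo` are hypothetical names, and the directory layout simply mirrors the listing above.

```python
import os
import tempfile


class LazyCacheDir:
    """Hypothetical helper: knows its path up front, creates it only on demand."""

    def __init__(self, root, reference_name, release):
        # Building the path string touches nothing on disk.
        self.path = os.path.join(
            root, "pyensembl", reference_name, "ensembl%d" % release
        )

    def ensure_exists(self):
        # Intended to be called right before the first write, e.g. a download.
        os.makedirs(self.path, exist_ok=True)
        return self.path


def demo():
    root = tempfile.mkdtemp()
    # Describe many releases, as a registry of known releases might.
    dirs = {r: LazyCacheDir(root, "GRCh38", r) for r in range(76, 101)}
    # Nothing has been created yet, not even the top-level folder.
    created_before = os.path.exists(os.path.join(root, "pyensembl"))
    # Only the requested release gets a directory.
    dirs[100].ensure_exists()
    created_after = sorted(os.listdir(os.path.join(root, "pyensembl", "GRCh38")))
    return created_before, created_after
```

With this pattern, constructing objects for every known release stays free of side effects, and only the release that is actually installed leaves a directory behind.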

iskandr commented 3 years ago

Yikes:

ensembl100/  ensembl77/  ensembl79/  ensembl81/  ensembl83/  ensembl85/  ensembl87/  ensembl89/  ensembl91/  ensembl93/  ensembl95/  ensembl97/  ensembl99/
ensembl76/   ensembl78/  ensembl80/  ensembl82/  ensembl84/  ensembl86/  ensembl88/  ensembl90/  ensembl92/  ensembl94/  ensembl96/  ensembl98/

I'll look into this.

iskandr commented 3 years ago

OK, this should be fixed:

ensembl100  ensembl77  ensembl93
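For anyone whose cache still holds empty directories from before the fix, the following is a rough sketch of how they could be found and removed. The function names `find_empty_release_dirs` and `remove_empty_release_dirs` are hypothetical and not part of pyensembl; the sketch assumes release directories are named `ensembl<N>`, as in the listings in this thread.

```python
import os


def find_empty_release_dirs(cache_root):
    """Return sorted paths of empty 'ensembl<N>' directories below cache_root."""
    empty = []
    for dirpath, dirnames, filenames in os.walk(cache_root):
        name = os.path.basename(dirpath)
        # A directory is empty if it holds neither files nor subdirectories.
        if name.startswith("ensembl") and not dirnames and not filenames:
            empty.append(dirpath)
    return sorted(empty)


def remove_empty_release_dirs(cache_root):
    """Delete the empty release directories and return the paths removed."""
    removed = find_empty_release_dirs(cache_root)
    for path in removed:
        # os.rmdir refuses to delete non-empty directories, so data is safe.
        os.rmdir(path)
    return removed
```

Calling `find_empty_release_dirs` first on the cache root (in this thread, `/home/jovyan/work/data/.data_source_cache/pyensembl`) gives a dry-run list to review before removing anything.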