Problem: Changes to Travis seem to have limited us to <4GB / 19GB of disk space. Our unit tests use significantly more than that while installing multiple Ensembl releases.
Solution:
Tell Travis to install fewer things using language: minimal in .travis.yml.
Don't install scipy or matplotlib
Don't install Ensembl releases 54 or 83
Don't index every table on ['seqname', 'start', 'end', 'strand'], since we already index on ['seqname', 'start', 'end']; if a query returns results on both strands, they can probably be filtered efficiently.
Get rid of the GTF object, which existed to cache query results on disk as CSV files. That caching slightly improved long-term performance at the cost of massive disk usage.
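To illustrate the indexing change in the list above, here's a rough sketch using the standard-library sqlite3 module. The table layout, the index name gene_pos, and the helper genes_at are made up for the example and are not the actual pyensembl schema or API. The point is only that a single index on (seqname, start, end) can narrow the candidate rows, and the strand condition is then checked on those rows, so a second index that also includes strand isn't strictly needed.

```python
import sqlite3

# Hypothetical miniature schema; the real database has more columns and tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE gene '
    '("seqname" TEXT, "start" INT, "end" INT, "strand" TEXT, "gene_name" TEXT)'
)
# Only the position index is created; there is no separate index with strand.
conn.execute('CREATE INDEX gene_pos ON gene ("seqname", "start", "end")')
conn.executemany(
    "INSERT INTO gene VALUES (?, ?, ?, ?, ?)",
    [
        ("1", 100, 200, "+", "A"),
        ("1", 150, 250, "-", "B"),
        ("2", 100, 200, "+", "C"),
    ],
)


def genes_at(conn, seqname, position, strand=None):
    """Return gene names overlapping a position, optionally on one strand."""
    query = (
        'SELECT gene_name FROM gene '
        'WHERE "seqname" = ? AND "start" <= ? AND "end" >= ?'
    )
    params = [seqname, position, position]
    if strand is not None:
        # Extra condition on strand; no strand-specific index is required.
        query += ' AND "strand" = ?'
        params.append(strand)
    query += " ORDER BY gene_name"
    return [row[0] for row in conn.execute(query, params)]


both_strands = genes_at(conn, "1", 160)
minus_only = genes_at(conn, "1", 160, strand="-")
```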
During my hunt for disk savings I also added a delete-source-files command to the pyensembl CLI, which lets you delete GTF and FASTA files while leaving their indexed representations in place.
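For a sense of what deleting source files amounts to, here's a minimal sketch. It is not the code behind the new command: the function name, the suffix list, and the assumption that downloaded sources and their indexed forms sit side by side in one cache directory are all hypothetical.

```python
from pathlib import Path

# Assumed suffixes for downloaded source files; anything else (e.g. an
# indexed database or pickle) is left untouched.
SOURCE_SUFFIXES = (".gtf", ".gtf.gz", ".fa", ".fa.gz")


def delete_source_files(cache_dir):
    """Delete GTF/FASTA sources in cache_dir and return the removed names."""
    removed = []
    for path in sorted(Path(cache_dir).iterdir()):
        if path.is_file() and path.name.endswith(SOURCE_SUFFIXES):
            path.unlink()
            removed.append(path.name)
    return removed
```

Matching on the full file-name ending (rather than `Path.suffix`) means a file such as `annotations.gtf.db` is kept while `annotations.gtf.gz` is removed.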