zkutalik / ssimp_software

GNU General Public License v3.0

Test.with.less.memory #84

aaronmcdaid closed 6 years ago

aaronmcdaid commented 6 years ago

This PR speeds up the tests a lot (although it's an ugly solution!).

The build database takes up approximately 7 GB of RAM. Most of our tests use only a subset of the chromosomes, so this PR allows loading just a subset of the build database by specifying which chromosomes to load. Most tests use at most three chromosomes, which cuts memory usage from 7.0 GB to between 0.5 and 2.0 GB. Some of our tests did use all 22 chromosomes, but I changed those so that no test uses more than three.
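The idea of loading only the requested chromosomes can be sketched as follows. This is a minimal Python illustration, not the actual ssimp code (which is not shown in this thread); the record layout and the names `BuildRecord`, `load_build_subset` and `wanted_chromosomes` are assumptions made up for the example.

```python
from typing import Iterable, Iterator, NamedTuple, Set


class BuildRecord(NamedTuple):
    """Hypothetical stand-in for one entry of the build database."""
    chromosome: int
    position: int
    snp_name: str


def load_build_subset(
    records: Iterable[BuildRecord],
    wanted_chromosomes: Set[int],
) -> Iterator[BuildRecord]:
    """Yield only the records that lie on the requested chromosomes.

    Records on other chromosomes are skipped as they stream past,
    so they are never held in memory by the caller.
    """
    for record in records:
        if record.chromosome in wanted_chromosomes:
            yield record


# Tiny made-up database covering three chromosomes.
full_database = [
    BuildRecord(1, 1000, "rs_a"),
    BuildRecord(2, 2000, "rs_b"),
    BuildRecord(22, 3000, "rs_c"),
    BuildRecord(22, 4000, "rs_d"),
]

# A test that only needs chromosome 22 keeps two of the four records.
subset = list(load_build_subset(full_database, {22}))
```

Filtering while reading, rather than loading everything and discarding afterwards, is what would keep peak memory proportional to the chromosomes actually used.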

In itself, this doesn't make a huge speed improvement, but the memory savings mean that multiple tests can easily be run in parallel, even on a normal laptop. My laptop has four processors, so I can run stu -j4 @all.tests to complete all the tests in 15 minutes, where previously I think it took at least an hour.
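The arithmetic behind this can be sketched in a few lines of Python. The per-test memory figures (7.0 GB before, up to 2.0 GB after) and the four processors come from the description above; the 8 GB of total RAM is purely an assumption for illustration, since the laptop's memory is not stated, and `max_parallel_tests` is a hypothetical helper.

```python
def max_parallel_tests(total_ram_gb: float, ram_per_test_gb: float, cpus: int) -> int:
    """Number of tests that can run at once without exceeding RAM or CPU count."""
    if ram_per_test_gb <= 0:
        raise ValueError("ram_per_test_gb must be positive")
    if cpus < 0:
        raise ValueError("cpus must not be negative")
    # How many tests fit in memory side by side.
    fits_in_ram = int(total_ram_gb // ram_per_test_gb)
    # Parallelism is capped by whichever resource runs out first.
    return min(fits_in_ram, cpus)


ASSUMED_LAPTOP_RAM_GB = 8.0  # assumption: not stated in the thread
CPUS = 4                     # "My laptop has four processors"

before = max_parallel_tests(ASSUMED_LAPTOP_RAM_GB, 7.0, CPUS)  # full build database
after = max_parallel_tests(ASSUMED_LAPTOP_RAM_GB, 2.0, CPUS)   # worst case after this PR

print(f"parallel tests before: {before}, after: {after}")
```

Under that assumed RAM size, memory rather than CPU count is the limit before the change, and the CPU count becomes the limit afterwards, which is consistent with -j4 only becoming practical with this PR.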

Details:

sinarueeger commented 6 years ago

@aaronmcdaid looks good!

Regarding the option _debug_build_chr: does this mean we could use it instead of (or together with) the option --impute.range, e.g. ... --impute.range=22 --debug_build_chr 22?

The improvements regarding the option N_max look fine.

sinarueeger commented 6 years ago

Btw - I am now using SSIMP on our HPC. The --download.1KG option works fine!

That also means that I can start running the tests there, with either stu/stu @all.tests or stu/stu -j4 @all.tests.

aaronmcdaid commented 6 years ago

On hpc1, perhaps you could use more than 4. I forget how many CPUs it has. Is it just four?

... does this mean we could use this instead (or together) ...

Yes, this can now be simplified a lot. If the user specifies --impute.range or --tag.range, then ssimp should identify the set of chromosomes and then load only those. Once that works, we can delete the _debug_build_chr option.

I only added _debug_build_chr in order to have something quick and dirty for the testing. I hadn't really planned it well, and it's good you noticed that it can be simplified and improved!
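The proposed simplification could look roughly like this. It is only a sketch in Python, not ssimp's implementation: the range syntax is an assumption (either a bare chromosome such as 22, as in the example quoted above, or a hypothetical CHR:start-CHR:end form), and `chromosomes_in_range` and `chromosomes_to_load` are made-up names.

```python
from typing import Optional, Set


def chromosomes_in_range(range_spec: str) -> Set[int]:
    """Return the chromosomes covered by one range specification.

    Assumed (hypothetical) formats:
      "22"              a whole chromosome
      "20:100-22:5000"  from chr 20, position 100, to chr 22, position 5000
    """
    start, _, end = range_spec.partition("-")
    first_chr = int(start.split(":")[0])
    # Without a "-" part, the range covers a single chromosome.
    last_chr = int(end.split(":")[0]) if end else first_chr
    if last_chr < first_chr:
        raise ValueError(f"range ends before it starts: {range_spec!r}")
    return set(range(first_chr, last_chr + 1))


def chromosomes_to_load(*range_specs: Optional[str]) -> Set[int]:
    """Union of the chromosomes needed by every range the user specified.

    Unspecified options are passed as None and ignored.
    """
    needed: Set[int] = set()
    for spec in range_specs:
        if spec:
            needed |= chromosomes_in_range(spec)
    return needed


# Stand-ins for the values of --impute.range and --tag.range.
impute_range = "22"
tag_range = None

to_load = chromosomes_to_load(impute_range, tag_range)
```

With something like this in place, the set of chromosomes to load would follow from the ranges the user already gives, which is why a separate debug option would no longer be needed.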

sinarueeger commented 6 years ago

HPC1 has many more CPUs, I believe. But I am using a different HPC now.