jshoyer opened 4 years ago
Dear jshoyer,
Thank you very much for sharing your table. I would like to ask you a few questions. How long did it take you to compute that table? I have tried to compute mine for 250 animals (500 sequences/chromosomes, 2.6 million configurations), but it is too slow: the job occupies only 2% of a server with 150 GB of RAM and 32 cores, so at that rate it would take about 2 years. My eventual goal is a table for 1000 animals, since I have at least 13000 genotypes. Is there any advice you can give me? Is there a way to allocate more memory and cores to the calculation to speed up the process? Thanks for your answer.
I used both Slurm job arrays and GNU parallel to parallelize the computations, with the --splits flag to LDhat's complete program. See the job script included in the Zenodo record: https://zenodo.org/record/3934350/files/ldhat-complete-n320-t0.01-split10000fold.sbatch
The job will still take quite a while with 32 CPU cores. I used hundreds of cores.
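The split-based approach above can be sketched as a Slurm job array, with each array task computing one slice of the lookup table. This is only a hypothetical outline: the exact flag names and values for complete (and the split/element arguments in particular) are assumptions here, so check the linked sbatch script on Zenodo for the real invocation.

```shell
#!/bin/bash
#SBATCH --array=1-100        # one task per table split (split count is illustrative)
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G
#SBATCH --time=48:00:00

# Hypothetical sketch of splitting the likelihood table computation.
# Flag names below are assumptions, not the exact LDhat CLI; consult the
# sbatch script in the Zenodo record for the invocation actually used.
./complete -n 320 -rhomax 100 -n_pts 101 -theta 0.01 \
           -split 10000 -element "${SLURM_ARRAY_TASK_ID}"
```

Because each split is independent, the same pattern works with GNU parallel on a single machine (e.g. `parallel ./complete ... -element {} ::: $(seq 1 100)`), at the cost of being limited to that machine's core count.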
In case anyone is interested, I created a new likelihood lookup table for n = 320 sequences/chromosomes; see https://zenodo.org/record/3934350. That table seemed sufficiently large and computationally expensive to make sharing worthwhile. I would have created a pull request, but the table is too large for GitHub (207.5 MB compressed, 806.8 MB uncompressed), and centralized distribution of the tables via Git is not disk-space-efficient anyway. Ideas for helping people discover the file would be welcome. Feel free to close this issue whenever.