explodecomputer / epiGPU

Search for epistasis, accelerated using graphics cards

any workaround for converting large datasets? #1

Open shadiakiki1986 opened 4 years ago

shadiakiki1986 commented 4 years ago

Hello. I'm trying out epiGPU. I see that you have a note on the readme:

There are problems converting larger datasets, i.e. above 4000 individuals and 600,000 SNPs.

I ran into this issue when running epiGPU on a file with 906k SNPs and 500 samples.

Is there any workaround that you can suggest for the conversion step?

explodecomputer commented 4 years ago

Thanks for the message. Unfortunately, I stopped maintaining this package several years ago and will not have much capacity to revisit it any time soon. In general, the IO side was quite complicated because the algorithm for performing the tests on the GPU was organised into warps of specific sizes, and larger sample sizes pose a problem here. The data management side was also a bit rudimentary and could be improved by using more modern data formats and internal structures.
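
To make the warp constraint above concrete, here is a minimal sketch of padding a row of genotypes so that its length is a multiple of a fixed group size. It is not taken from epiGPU's source. The warp size of 32 (typical for NVIDIA GPUs), the missing-genotype code `3`, and the function names `padded_length` and `pad_genotypes` are all assumptions made for illustration.

```python
WARP_SIZE = 32  # assumption: typical NVIDIA warp width; epiGPU's group size may differ
MISSING = 3     # assumption: hypothetical code marking a padded (missing) genotype


def padded_length(n_individuals, warp_size=WARP_SIZE):
    """Round the number of individuals up to the next multiple of warp_size."""
    # -(-a // b) is integer ceiling division.
    return -(-n_individuals // warp_size) * warp_size


def pad_genotypes(row, warp_size=WARP_SIZE, fill=MISSING):
    """Return one SNP's genotypes padded with `fill` to a whole number of warps."""
    target = padded_length(len(row), warp_size)
    return list(row) + [fill] * (target - len(row))


if __name__ == "__main__":
    # 500 individuals do not divide evenly into groups of 32.
    print(padded_length(500))
    print(len(pad_genotypes([0, 1, 2])))
```

Under these assumptions, every SNP row has to be converted to the padded layout before it reaches the GPU, so the conversion step has to know the group size in advance.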