Closed gregorydonahue closed 5 months ago
You need to set a smaller batch size. You can set the batch size separately for each step in the JSONs. Internally, all sequences initially live on the CPU; they are moved to the GPU in batches, and the results are moved back to the CPU after the calculations are done. The total number of sequences doesn't matter -- only the number moved to the GPU at a time.
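For example, lowering the batch size for the attribution step might look like the snippet below. The key names and layout here are assumptions for illustration -- check the JSON files your own pipeline uses for the exact field names:

```json
{
    "batch_size": 16,
    "chroms": ["chr10"]
}
```

A smaller `batch_size` directly caps how many sequences sit on the GPU at once, which is what bounds peak memory during attribution.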
Please re-open if issues persist.
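The batching pattern described above can be sketched in plain PyTorch. This is not chrombpnet's actual code, just a minimal illustration of why peak GPU memory scales with the batch size rather than with the total number of sequences; `batched_predict` is a hypothetical helper:

```python
import torch

def batched_predict(model, X_cpu, batch_size=64, device="cpu"):
    """Run model over X_cpu in chunks of batch_size on `device`.

    Inputs stay on the CPU; each batch is moved to the device,
    and the results are moved back to the CPU, so only one batch
    ever occupies accelerator memory at a time.
    """
    outputs = []
    model = model.to(device).eval()
    with torch.no_grad():
        for start in range(0, len(X_cpu), batch_size):
            batch = X_cpu[start:start + batch_size].to(device)
            out = model(batch)
            outputs.append(out.cpu())  # free the GPU copy for the next batch
    return torch.cat(outputs)

# Toy usage: a small linear model on random data stands in for the real
# sequence model; 1,000 inputs are processed 64 at a time.
model = torch.nn.Linear(10, 1)
X = torch.randn(1000, 10)  # stays on the CPU
y = batched_predict(model, X, batch_size=64)
print(y.shape)  # torch.Size([1000, 1])
```

On a machine with a GPU you would pass `device="cuda"`; the output shape and values are independent of the batch size chosen.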
Hi,
I'm running chrombpnet on some ATAC-seq data, and I was able to train the model without issue (chrombpnet fit) and also run predictions (chrombpnet predict). However, the following step (chrombpnet attribute) - which I gather calculates the SHAP scores and maybe runs TF-MoDISco - fails with the following:
...the relevant bit being the torch.cuda.OutOfMemoryError. As you can see, I'm loading ~100k ATAC-seq OCRs, and then chrombpnet tries to load > 2 million something-or-others. This may simply be too much. I have tried setting the PYTORCH_CUDA_ALLOC_CONF environment variable to values higher and lower than the requested 530 MB, but nothing works. Am I just out of luck here? I also tried editing the JSON to restrict the interpret 'chroms' parameter to just chr10, thinking that this might limit the number of loaded sequences, but that also failed with the same error.
Best, Greg