Hello,

I'm having trouble getting scanorama to run in a reasonable time on my computer using the R interface. I realize this is a little outside the scope of scanorama itself, but I wanted to see if anyone else has had this issue and was able to figure it out. I followed the instructions from this issue/discussion thread and have the following:

This takes forever (I'm basing this on the 15-minute estimate here). I killed the process about 45 minutes in, and this was the output so far:

I'm trying to integrate 3 scRNAseq datasets with 16,340 cells, 25,981 cells, and 68,433 cells:

I tried to follow the instructions to check that numpy is using multiple cores (found here), but my Python installation does not seem to correspond with those instructions. I can't find the dist-packages directory to check that I'm linked to OpenBLAS.

    which python
    /home/lab/miniconda3/bin/python
    python --version
    Python 3.7.10

I made the test.py script and ran it, and CPU usage for the process sits at 100% (i.e. a single core), which would indicate that I'm not using the parallelization functionality. I also checked my CPU usage when running the scanorama$correct call and saw the same thing.

I tried using the future package with plan("multisession") (found here) and still only have one process running.

I don't see how I can use foreach or one of the other parallel R calls, because there's nothing to apply over: there is just the one call to scanorama$correct.
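Since a conda install keeps packages under site-packages rather than dist-packages, the OpenBLAS check probably needs a different route on my setup. Below is a rough sketch of what I think an equivalent check looks like. It is not the script from the linked instructions: the function name time_matmul and the matrix size are my own choices, and the environment variables listed are just the ones BLAS libraries commonly honor.

```python
import os
import time

import numpy as np


def time_matmul(n=2000, repeats=3, seed=0):
    """Return the best wall-clock time (seconds) for an n x n matrix product."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n))
    b = rng.standard_normal((n, n))
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        a @ b  # dense matmul is dispatched to the linked BLAS library
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Where numpy actually lives (site-packages under conda).
    print("numpy", np.__version__, "from", os.path.dirname(np.__file__))
    # Build information, including which BLAS/LAPACK numpy was linked against.
    np.show_config()
    # Thread-count variables; a value of 1 would force single-threaded BLAS.
    for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
        print(var, "=", os.environ.get(var, "<unset>"))
    print("logical cores:", os.cpu_count())
    print("best matmul time: %.3f s" % time_matmul())
```

If the BLAS is multithreaded, I would expect top to show this process well above 100% during the matrix product. The same check would have to be run with whichever Python the R session actually loads, which may not be the one that which python reports.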
Thanks!