Closed AnnabelPerry closed 3 years ago
I really like slurm. I used it ~ 15 years ago when we did not really have multi-core systems and needed slurm for parallel work.
So these days I presume you use Slurm, possibly even without MPI, "just" as a resource manager? Or also for explicit parallelism (which is one to two levels harder to code...)?
I can't say that I know what MPI is - I've seen it on the supercomputer help page in relation to SLURM, but they don't include information on what MPI is. I know we can use slurm on our supercomputers to submit parallel jobs, but I don't really know what that is either.
(Yup. MPI is one very Comp-Sciency concept for parallel computing. No need to worry now. Back in the day we used slurm as a front-end to MPI jobs, but you can of course use it "just" to launch jobs in batch and have it manage the resources on the big computer -- that is slurm's job.)
I'm battling more supercomputer demons this morning - I'm getting segmentation faults when I run jobs to collect runtime info from the MicroGenotyper() function:
*** caught segfault ***
address 0x202568ca527, cause 'memory not mapped'
Traceback:
1: MicroGenotyper(bams, "/scratch/user/annabelperry/PollyRuntimes/InputFiles/Edited_Birch_Lookup_Table.csv", scaffold_vector, output_names)
An irrecoverable exception occurred. R is aborting now ...
/sw/hprc/sw/R_tamu/bin/Rscript: line 75: 102073 Segmentation fault (core dumped) ${EBROOTR}/bin/Rscript ${ARGS[@]}
rm: cannot remove 'aligned_SRR6511793.bam': No such file or directory
When I run the seff command on the slurm jobs, it shows the job has not used all the requested memory:
[annabelperry@grace1 5]$ seff 719533
Job ID: 719533
Cluster: grace
User/Group: annabelperry/annabelperry
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 80
CPU Utilized: 01:55:19
CPU Efficiency: 1.20% of 6-16:21:20 core-walltime
Job Wall-clock time: 02:00:16
Memory Utilized: 1.86 TB
Memory Efficiency: 65.04% of 2.86 TB
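For what it's worth, the CPU efficiency line in that seff output can be reproduced by hand: it is the CPU time used divided by (cores x wall-clock time). Below is a rough sketch of that arithmetic, with the numbers copied from the output above; `to_seconds` is just a helper name made up for this example.

```shell
#!/bin/sh
# Convert an HH:MM:SS string to seconds (hypothetical helper for this sketch).
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

cores=80                              # "Cores per node"
cpu_used=$(to_seconds "01:55:19")     # "CPU Utilized"
wall=$(to_seconds "02:00:16")         # "Job Wall-clock time"

# core-walltime = number of cores * wall-clock seconds
core_walltime=$((cores * wall))

# CPU efficiency = CPU time actually used / core-walltime, as a percentage
cpu_eff=$(awk -v u="$cpu_used" -v t="$core_walltime" \
    'BEGIN { printf "%.2f", 100 * u / t }')

echo "core-walltime (s): $core_walltime"
echo "CPU efficiency: ${cpu_eff}%"
```

The 577280 seconds of core-walltime correspond to the 6-16:21:20 that seff reports, and the ratio comes out at the same 1.20%.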
Since I am just trying to collect the script's runtime, I delete the input and output directly after running (to save space). The input (aligned_SRR6511793.bam) is in the /scratch/user/annabelperry/PollyRuntimes/InputFiles/ directory, as shown in the R script, but when I called rm in the job script I forgot to include the full path, which is why you see the "No such file or directory" error. This is not the cause of the segmentation fault, though: the segfault occurs while the R script is running, and the R script itself uses the correct path (see below).
library("Polly")
setwd("/scratch/user/annabelperry/PollyRuntimes/MicroGenotyper")
scaffold_vector <- c("ScyDAA6_1508_HRSCAF_1794", "ScyDAA6_1196_HRSCAF_1406",
"ScyDAA6_5987_HRSCAF_6712", "ScyDAA6_8_HRSCAF_51",
"ScyDAA6_1107_HRSCAF_1306", "ScyDAA6_2393_HRSCAF_2888",
"ScyDAA6_1592_HRSCAF_1896", "ScyDAA6_1439_HRSCAF_1708",
"ScyDAA6_1854_HRSCAF_2213", "ScyDAA6_10_HRSCAF_60",
"ScyDAA6_11_HRSCAF_73", "ScyDAA6_695_HRSCAF_847",
"ScyDAA6_1934_HRSCAF_2318", "ScyDAA6_5078_HRSCAF_5686",
"ScyDAA6_5984_HRSCAF_6694", "ScyDAA6_2469_HRSCAF_2980",
"ScyDAA6_1473_HRSCAF_1750", "ScyDAA6_5983_HRSCAF_6649",
"ScyDAA6_1859_HRSCAF_2221", "ScyDAA6_2_HRSCAF_26",
"ScyDAA6_7_HRSCAF_50", "ScyDAA6_2113_HRSCAF_2539",
"ScyDAA6_2188_HRSCAF_2635", "ScyDAA6_932_HRSCAF_1100")
bams <- c("/scratch/user/annabelperry/PollyRuntimes/InputFiles/aligned_SRR6511793.bam")
output_names <- c("MGR-F4.csv")
ptm <- proc.time()
MicroGenotyper(bams,"/scratch/user/annabelperry/PollyRuntimes/InputFiles/Edited_Birch_Lookup_Table.csv",scaffold_vector,output_names)
MicroGenotyperRunTime <- proc.time() - ptm
cat("\nRuntime for MicroGenotyper on Bam File 4:\n")
print(MicroGenotyperRunTime)
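On the rm side note from earlier: one way to avoid the "No such file or directory" message is to build the full path once in the job script and reuse it for the cleanup step. This is only a sketch of that idea, not the actual job script. The variable names are made up, and a temporary directory stands in for the real InputFiles directory so it can be run anywhere.

```shell
#!/bin/sh
# Assumption: in the real job script INPUT_DIR would be
# /scratch/user/annabelperry/PollyRuntimes/InputFiles.
# A temporary directory stands in for it here.
INPUT_DIR=$(mktemp -d)
BAM="$INPUT_DIR/aligned_SRR6511793.bam"
touch "$BAM"    # stand-in for the real BAM file

# ... the Rscript call would go here ...

# Remove by full path, so the current working directory does not matter.
# -f keeps rm quiet if the file is already gone.
rm -f -- "$BAM"
```

As noted above, this only tidies up the cleanup step; it has no bearing on the segfault itself.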
The sysadmins are shutting Grace down for maintenance all day tomorrow. Hopefully this is one of the issues they're going to fix.
Sounds like you want to open a new issue for a new topic and delete this one here? That's exactly what you were thinking isn't it? ;-)
Yes haha - I'll do that now
Ok, after looking into this a bit more, I think RStudio and command-line R have two distinct issues.
RStudio seems to be wiping the HTSlib Makevars flags and is thus unable to find HTSlib. Command-line R seems able to find HTSlib but unable to install Polly (maybe I don't have permission to install to Grace..?).
Here's my rationale for this diagnosis:
Issues with Command-Line R:
2. When I run "R CMD check" on the tar.gz file generated using "R CMD build Polly", I get this error:
Since the dependency check passes successfully, I think command-line R can "find" HTSlib.
When I run "R CMD INSTALL", I don't get any output at all. This makes me think that something is amiss with my permission to install to Grace.
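A quick way to test the permissions theory is to check whether the library directory R installs into is writable by my user. The sketch below is an assumption about how one might check this, not something from the Grace docs. `check_lib_dir` is a made-up helper, and a temporary directory stands in for the real library path so the snippet runs anywhere.

```shell
#!/bin/sh
# Report whether a directory exists and is writable by the current user.
# (Hypothetical helper name, for illustration only.)
check_lib_dir() {
    if [ -d "$1" ] && [ -w "$1" ]; then
        echo "writable: $1"
    else
        echo "NOT writable (or missing): $1"
        return 1
    fi
}

# Stand-in directory so the sketch runs without R installed.
demo_lib=$(mktemp -d)
check_lib_dir "$demo_lib"
```

On the cluster, the directory to pass in would be the first entry R reports from `.libPaths()`, e.g. the output of `Rscript -e 'cat(.libPaths()[1])'`. If that directory turns out not to be writable, `R CMD INSTALL --library=<dir>` with a directory I own might be worth a try.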
As an aside, when I untar the tar.gz file, I see all the expected files:
One possible explanation is that there is an issue with the R/4.0.3 module installed on Grace. I noticed an error associated with "mkl-static-ilp64-iomp.pc" in the output of "pkg-config --list-all". Since R/4.1.0 relies upon this version of impi, I went ahead and tried to fix it using the following command I found on StackOverflow:
However, this command did not fix the error, telling me that something is wrong with my ability to execute commands from the Grace command line.
Issues with RStudio:
I only see an "htslib was not found" error if I check my package using "devtools::check()" from RStudio. Additionally, when I run "devtools::check()" from RStudio, the HTS_CFLAGS and HTS_LIBS are wiped from the "src/Makevars" file I generated using "./configure". This does not happen if I run "R CMD check". So, RStudio (for whatever reason) wipes the "src/Makevars" flags and that's why it can't find htslib.
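To pin down exactly when the flags disappear, the same check can be run on "src/Makevars" right after "./configure" and again after "devtools::check()". Here is a minimal sketch of such a check. `makevars_has_hts_flags` is a name invented for this example, and the two demo files (with made-up flag values) only imitate an intact and a wiped Makevars.

```shell
#!/bin/sh
# Return 0 if the given Makevars file mentions both HTSlib variables.
# (Hypothetical helper name, for illustration only.)
makevars_has_hts_flags() {
    grep -q 'HTS_CFLAGS' "$1" && grep -q 'HTS_LIBS' "$1"
}

# Demo files with made-up contents, standing in for src/Makevars.
demo_dir=$(mktemp -d)
printf '%s\n' 'HTS_CFLAGS=-I/hypothetical/include' \
              'HTS_LIBS=-L/hypothetical/lib -lhts' > "$demo_dir/Makevars.intact"
printf '%s\n' '# flags wiped' > "$demo_dir/Makevars.wiped"

for f in "$demo_dir/Makevars.intact" "$demo_dir/Makevars.wiped"; do
    if makevars_has_hts_flags "$f"; then
        echo "flags present: $f"
    else
        echo "flags missing: $f"
    fi
done
```

In the package directory the call would simply be `makevars_has_hts_flags src/Makevars`, run once before and once after the RStudio check.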
I don't really care if the issue with RStudio is resolved - I just need to be able to run my Polly commands using command-line R.