Sorry to bother you again. I restarted the analysis on a Slurm cluster with a bash script like this:
#SBATCH -t 3-23:59:59
# Define partition
#SBATCH --partition=long
# Set number of nodes to run
#SBATCH --nodes=1
# Set number of cpus
#SBATCH -c 16
# Set memory
#SBATCH --mem=128G
# Define email for script execution
#SBATCH --mail-user=loic.talignani@ird.fr
# Define notification types
#SBATCH --mail-type=ALL
###################################################################
echo "Load module"
module purge
module load r/4.3.1
module load bcftools/1.15.1
echo "run Local PCA for 5kb windows"
Rscript --vanilla run_lostruct.R -i data -t bp -s 5000 -I data/sample_info.tsv > lostruct-${SLURM_JOB_ID}.Rout 2>&1
As you can see, I used the -t bp and -s 5000 options. I did not get the error described below when I previously ran with -t snp -s 1000. The run_lostruct.R script itself is unchanged.
After 15 hours, the job stops with the error message:
Error in cmdscale(pc.distmat[!na.inds, !na.inds], k = opt$nmds) :
NA values not allowed in 'd'
Calls: cbind -> cbind -> data.frame -> cmdscale
Execution halted
As far as I can tell, though, the script already handles NAs: the cmdscale call subsets pc.distmat with !na.inds, so the MDS should be computed without them. Where do you think the problem comes from?
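One way to narrow this down, sketched here under assumptions rather than taken from run_lostruct.R: if the NA mask were built from a single column, a window whose first value is present but whose later values are NA would pass the filter and could still put NA into the distance matrix. The shell snippet below counts such rows in a comma-separated file with one header line. The function name count_na_rows, the toy column names, and the file layout are all hypothetical and would need adapting to the real *.pca.csv files.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print two counts for a comma-separated file with one header line:
#   1) rows whose first column is NA/NaN (a first-column mask would drop these)
#   2) rows whose first column is present but some later column is NA/NaN
count_na_rows() {
  awk -F',' 'NR > 1 {
    first = ($1 == "NA" || $1 == "NaN")
    any = 0
    for (i = 1; i <= NF; i++) if ($i == "NA" || $i == "NaN") any = 1
    if (first) full++
    else if (any) partial++
  }
  END { printf "%d %d\n", full, partial }' "$1"
}

# Toy file standing in for one *.pca.csv (hypothetical layout and column names).
demo=$(mktemp)
printf '%s\n' 'c1,c2,c3,c4' \
  '10,5,0.1,0.2' \
  'NA,NA,NA,NA' \
  '8,4,NA,0.3' > "$demo"
count_na_rows "$demo"
rm -f "$demo"
```

A non-zero second count for any real file would point to windows that are only partially missing, which may be more likely with short bp windows that contain few SNPs.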
Dear Peter,
Here's a link to download the *.pca.csv and regions.csv files, as well as the config.json file: https://filesender.renater.fr/?s=download&token=d83c40ce-8178-4fff-9d89-9cde1b3d3b2a
Thanks in advance for your help.
Best regards.