Error executing process > 'POPGEN48_SCALEPOPGEN:SCALEPOPGEN:RUN_ADMIXTURE:ADMIXTURE

eduardo-pizarro commented 6 months ago

Description of the bug

I tested the pipeline with the command provided in the README:

nextflow run scalepopgen/ -profile test,docker -qs 10

but I keep running into an issue with ADMIXTURE. I ran the same command with a different -qs value:

nextflow run scalepopgen/ -profile test,docker -qs 2

but I got the same error. I have run the command several times, and the only change I have seen is the K value used in the admixture command. I have also deleted the work, test, and scalepopgen directories and started from scratch, but the problem remains. I have no clue what the cause is. Any help would be great. Cheers

Command used and terminal output

$ nextflow run scalepopgen/ -profile test,docker -qs 2

[-        ] process > POPGEN48_SCALEPOPGEN:SCALEPOPGEN:MULTIQC_GENETIC_STRUCTURE                                                         -
[b4/b87a63] NOTE: Process `POPGEN48_SCALEPOPGEN:SCALEPOPGEN:RUN_ADMIXTURE:ADMIXTURE (scalepopgen_ld_filtered_update_chrom_id)` terminated with an error exit status (139) -- Execution is retried (1)
[f2/4cdfd7] NOTE: Process `POPGEN48_SCALEPOPGEN:SCALEPOPGEN:RUN_ADMIXTURE:ADMIXTURE (scalepopgen_ld_filtered_update_chrom_id)` terminated with an error exit status (139) -- Execution is retried (1)
[28/ae9d00] NOTE: Process `POPGEN48_SCALEPOPGEN:SCALEPOPGEN:RUN_ADMIXTURE:ADMIXTURE (scalepopgen_ld_filtered_update_chrom_id)` terminated with an error exit status (139) -- Execution is retried (1)
[57/310939] NOTE: Process `POPGEN48_SCALEPOPGEN:SCALEPOPGEN:RUN_ADMIXTURE:ADMIXTURE (scalepopgen_ld_filtered_update_chrom_id)` terminated with an error exit status (139) -- Execution is retried (1)
[52/56f0f7] NOTE: Process `POPGEN48_SCALEPOPGEN:SCALEPOPGEN:RUN_ADMIXTURE:ADMIXTURE (scalepopgen_ld_filtered_update_chrom_id)` terminated with an error exit status (139) -- Execution is retried (1)
ERROR ~ Error executing process > 'POPGEN48_SCALEPOPGEN:SCALEPOPGEN:RUN_ADMIXTURE:ADMIXTURE (scalepopgen_ld_filtered_update_chrom_id)'

Caused by:
  Process `POPGEN48_SCALEPOPGEN:SCALEPOPGEN:RUN_ADMIXTURE:ADMIXTURE (scalepopgen_ld_filtered_update_chrom_id)` terminated with an error exit status (139)

Command executed:

  admixture \
      scalepopgen_ld_filtered_update_chrom_id.bed \
      3 \
      -j2 \
      --cv=5 >& 3.log

  cat <<-END_VERSIONS > versions.yml
  "POPGEN48_SCALEPOPGEN:SCALEPOPGEN:RUN_ADMIXTURE:ADMIXTURE":
      admixture: $(echo $(admixture 2>&1) | head -n 1 | grep -o "ADMIXTURE Version [0-9.]*" | sed 's/ADMIXTURE Version //' )
  END_VERSIONS

Command exit status:
  139

Command output:
  (empty)

Command error:
  .command.sh: line 6:    38 Segmentation fault      admixture scalepopgen_ld_filtered_update_chrom_id.bed 3 -j2 --cv=5 &>3.log

Work dir:
  /work/dir/work/8b/75d8eade6408937d0d1f397045f298

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

 -- Check '.nextflow.log' file for details

Relevant files

17042024.nextflow.log.txt scalepopgen.zip

System information

Nextflow version 23.10.1, desktop workstation, Docker, Ubuntu 22.04.4

NPogo commented 6 months ago

Hello! As far as I can see from your log file, 32 CPUs and 23.3 GB (13.8 GB) of memory are available to you. The Admixture module runs with »process_high« resources, that is, up to 12 CPUs and 72 GB of memory per job, which is probably causing the problem in your case. You can check the process-specific resource requirements in scalepopgen/conf/base.config and adjust them to your local computing system. I would suggest running the pipeline with »process_low« and 1 CPU: in the file scalepopgen/modules/nf-core/admixture/main.nf, change the label »process_high« to »process_low«. Alternatively, don't change anything and just reduce the number of parallel jobs from 2 to 1, i.e. "-qs 1".
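The comparison described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not part of the pipeline: the `Resources` class and the `exceeded` helper are hypothetical names, and the numbers are the ones quoted from the log file and from the »process_high« label.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Resources:
    cpus: int
    memory_gb: float


def exceeded(requested: Resources, available: Resources) -> List[str]:
    """Return one message per resource where the request exceeds what is available."""
    problems = []
    if requested.cpus > available.cpus:
        problems.append(
            f"cpus: requested {requested.cpus}, available {available.cpus}"
        )
    if requested.memory_gb > available.memory_gb:
        problems.append(
            f"memory: requested {requested.memory_gb} GB, "
            f"available {available.memory_gb} GB"
        )
    return problems


# Values taken from the comment above (hypothetical variable names).
process_high = Resources(cpus=12, memory_gb=72)
workstation = Resources(cpus=32, memory_gb=23.3)

for problem in exceeded(process_high, workstation):
    print(problem)
```

With these numbers only the memory request is flagged; the CPU request fits within the 32 available cores.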

Please let us know if this solved the issue.

BioInf2305 commented 6 months ago

Hello, yeah I agree with @NPogo. From your log file, it appears that all the processes completed (except Admixture), and this process requires a large number of CPUs. Run with "-qs 1" as suggested by @NPogo, and let us know whether it worked.

Just for your info, I ran the following command on my local desktop after git cloning the repo and using docker

nextflow run scalepopgen/ -profile test,docker -resume -qs 1

and it successfully completed within ~30-45 minutes (see the log file attached).

In your run, the exit code is 139 (128 + 11, i.e. signal 11, SIGSEGV), which matches the "Segmentation fault" in your command error: the program was terminated because of an invalid memory access.
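As a quick way to read such exit codes: shells report a process killed by a signal as 128 plus the signal number. The following Python sketch decodes a status under that convention; `describe_exit_status` is a hypothetical helper name, and the signal lookup assumes a Linux system (where signal 11 is SIGSEGV).

```python
import signal


def describe_exit_status(status: int) -> str:
    """Translate a shell exit status into a signal name when it is above 128."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            return f"killed by unknown signal {signum}"
        return f"killed by {name} (signal {signum})"
    return f"exited with code {status}"


print(describe_exit_status(139))
```

For status 139 this reports SIGSEGV, which is consistent with the "Segmentation fault" line in the terminal output above.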

Please let us know if you still have the same issue.

test_docker_run.log

eduardo-pizarro commented 6 months ago

Hello! Thanks for the quick response. I did three different tests based on your suggestions:

1. Just changing to -qs 1
2. Changing to -qs 1 and editing the scalepopgen/conf/base.config file (attached)
3. Changing to -qs 1 and editing both the scalepopgen/conf/base.config file and the scalepopgen/modules/nf-core/admixture/main.nf file (attached)

For each case, the problem persisted. I also removed the folders and tried from a new cloned directory with the changes mentioned.

I'm working on a Windows 10 workstation with WSL2 enabled and Ubuntu 22.04.4. Docker is installed on Windows, linked to WSL2, and working from Ubuntu.

In the zip file I attached the changes I made to the two files, and I also included the .nextflow.log file for the last run and the screen output.

files.zip

Cheers

eduardo-pizarro commented 6 months ago

I have tried the pipeline on a server, and I had a different issue with Admixture:

Command error:
  Unable to find image 'quay.io/biocontainers/admixture:1.3.0--0' locally
  1.3.0--0: Pulling from biocontainers/admixture
  docker: [DEPRECATION NOTICE] Docker Image Format v1 and Docker Image manifest version 2, schema 1 support is disabled by default and will be removed in an upcoming release. Suggest the author of quay.io/biocontainers/admixture:1.3.0--0 to upgrade the image to the OCI Format or Docker Image manifest v2, schema 2.

This happens because Docker image manifest version 2, schema 1 is not supported by default in Docker Engine version 26, and this support will be removed completely in future versions.

After fixing this, the test profile ran successfully on the server. So I do not know what the problem might be on the local computer. It could be related to RAM, but I have set the parameters the way you suggested and the problem persists.

Any further suggestions to fix the problem in the local computer?

BioInf2305 commented 6 months ago

Based on this blog post on the Nextflow website (https://www.nextflow.io/blog/2021/setup-nextflow-on-windows.html), I hypothesize that it could be an issue with the Docker image of the ADMIXTURE tool. It could be that the ADMIXTURE Docker image is not compatible with the Windows Subsystem for Linux (WSL). Try the solution described in the section "Step 3" of the blog post; basically, follow these steps (I am quoting exactly from the link above):

1). Edit the .wslconfig file in your Windows home directory. You can do this using PowerShell as shown:

PS C:\Users\<username> notepad .wslconfig

2). Add these two lines to the .wslconfig file and save it:


[wsl2]
kernelCommandLine = vsyscall=emulate

3). Restart your machine.
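The edit in steps 1 and 2 can also be scripted. The following is a minimal Python sketch, assuming the .wslconfig file is simple enough to round-trip through `configparser`; `enable_vsyscall_emulation` is a hypothetical helper, it overwrites any existing `kernelCommandLine` value, and comments in the file are not preserved. The demonstration writes to a temporary directory rather than the real Windows home directory.

```python
import configparser
import tempfile
from pathlib import Path


def enable_vsyscall_emulation(config_path: Path) -> None:
    """Ensure the [wsl2] section sets kernelCommandLine = vsyscall=emulate."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the key's case instead of lowercasing it
    if config_path.exists():
        parser.read(config_path)
    if not parser.has_section("wsl2"):
        parser.add_section("wsl2")
    # Replaces any existing kernelCommandLine value.
    parser.set("wsl2", "kernelCommandLine", "vsyscall=emulate")
    with config_path.open("w") as handle:
        parser.write(handle)


# Demonstration on a throwaway file, not on the real .wslconfig.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / ".wslconfig"
    enable_vsyscall_emulation(demo_path)
    print(demo_path.read_text())
```

To apply it for real, the function would be called with the path to .wslconfig in the Windows home directory, followed by the restart in step 3.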

Let me know if it works.

Unfortunately, I don't have access to WSL for now, so I cannot test it myself. One thing I could do is containerize the ADMIXTURE tool on a new base image and use that in place of the biocontainer image in the workflow. This could take some time, but meanwhile I am glad that you managed to run the test example on your server.

eduardo-pizarro commented 6 months ago

I checked the blog post and followed your suggestion, and it worked! Even with the parameter -qs 10. Thanks a lot!

I'm closing the issue since that change solved it.

Cheers
