mourisl / T1K

T1K is a versatile method to genotype highly polymorphic genes (e.g. KIR, HLA) with bulk or single-cell RNA-seq, WGS or WES data.
MIT License

Running T1K inside singularity container #37

Closed: sihan2k closed this issue 2 months ago

sihan2k commented 2 months ago

Hello, I am trying to run this tool inside a Singularity container on an HPC. The container was created in April 2024, and the GitHub repo version therefore matches that date.

I am trying to run this command, but the same error also occurs when running the command with the example data from the README.md. I have removed the sample names from the command, but the paths are still complete.

/T1K/run-t1k -t 20 -1 /scratch/49723881.i-07-moab1.eth.clb/blmsvooqtq/{sample_name}_R1.fastq.gz -2 /scratch/49723881.i-07-moab1.eth.clb/blmsvooqtq/{sample_name}_R2.fastq.gz -f /ngc/shared/T1K/hlaidx/hlaidx_rna_seq.fa -o 58ndconyf-{sample_name}_Fase04427 --preset hla --od /scratch/49723881.i-07-moab1.eth.clb/blmsvooqtq

This is the error that's reported (this was on the test example run): system /T1K/fastq-extractor -t 8 -f kiridx/kiridx_rna_seq.fa -o T1K_example_candidate -1 example/example_1.fq -2 example/example_2.fq failed: 11 at ./run-t1k line 61.

I'm curious whether anyone has experience running T1K in a Singularity container, or has run into similar issues.

Thanks,

Simon

Bondada20 commented 2 months ago

I just ran T1K with the example data, and it worked perfectly. I've used T1K with Docker (haven't tried Singularity), and that also worked well. Did you encounter the error when running T1K locally or on your login node (HPC) without using any container?

mourisl commented 2 months ago

Are you using "--bind" to mount the data on your local server to the singularity running environment?

sihan2k commented 2 months ago

> I just ran T1K with the example data, and it worked perfectly. I've used T1K with Docker (haven't tried Singularity), and that also worked well. Did you encounter the error when running T1K locally or on your login node (HPC) without using any container?

I did not encounter the error when running it on the HPC login node without any containers. That's how it is run in production at the moment. It only happens inside the container, whether I call singularity exec and run the commands from outside the container, or open a shell and run either command from inside.

sihan2k commented 2 months ago

> Are you using "--bind" to mount the data on your local server to the singularity running environment?

The singularity command before '/T1K/run-t1k' looks like this: singularity exec --bind /ngc/projects2/gm/people/simonh/RNA_test/ngs_pipeline_workdir/results/variants/hla_t1k/rna --bind /ngc/projects2/gm/people/simonh/RNA_test/ngs_pipeline_workdir/temp/align/bbduk --bind /ngc/shared/T1K/hlaidx --bind /scratch /ngc/projects/gm/external_tools/singularity/t1k/containers/t1k_latest.sif /T1K/run-t1k ... etc.

I am passing a --bind argument for each folder the Singularity container needs in order to reach input, output and other necessary files. At first I expected this to be the issue as well, but after some testing, and after trying the test command with the example data inside the container, I figure it may be something else.
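Editorial note: one way to rule out a bind-mount problem is to check, from inside the container, that every input and reference path is actually readable. The sketch below is only an illustration; `check_paths` is a hypothetical helper name, not part of T1K or Singularity.

```shell
# check_paths: print a status line for each path and return non-zero
# if any of them is missing or unreadable.
# Hypothetical helper for illustration; not part of T1K.
check_paths() {
  missing=0
  for p in "$@"; do
    if [ -r "$p" ]; then
      echo "ok      $p"
    else
      echo "MISSING $p"
      missing=1
    fi
  done
  return "$missing"
}
```

Saved into a script that sits in one of the bound directories and run through singularity exec with the same --bind options as the real command, this would show whether a path such as /ngc/shared/T1K/hlaidx/hlaidx_rna_seq.fa is visible from inside the container.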

mourisl commented 2 months ago

What system is the singularity image created from? Is it based on ARM architecture?

sihan2k commented 2 months ago

The image is built for the AMD64 architecture on Linux (Ubuntu 22.04).
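Editorial note: the architecture of the image and the architecture of the compiled binary inside it can differ, so it may help to inspect the binary itself (for example /T1K/fastq-extractor). The `file` utility reports this directly when it is installed; the sketch below reads the ELF header with `od` instead, since minimal images often lack `file`. `elf_machine` is a hypothetical helper name, and it assumes a little-endian ELF file.

```shell
# elf_machine: print the target architecture recorded in an ELF header.
# Hypothetical helper for illustration; assumes a little-endian ELF file.
elf_machine() {
  # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
  magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
  if [ "$magic" != "7f454c46" ]; then
    echo "not-elf"
    return 0
  fi
  # e_machine is a 2-byte field at offset 18 of the ELF header.
  mach=$(od -An -tx1 -j18 -N2 "$1" | tr -d ' \n')
  case "$mach" in
    3e00) echo "x86-64" ;;
    b700) echo "aarch64" ;;
    *)    echo "other:$mach" ;;
  esac
}
```

Comparing the result for the binary with the output of uname -m inside the container would show whether the two match.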

sihan2k commented 2 months ago

Running any of these commands in older (1- and 2-year-old) Singularity containers that contain an older version of the GitHub repo throws this error instead:
/T1K/fastq-extractor: 1: /T1K/fastq-extractor: Syntax error: "(" unexpected
system /T1K/fastq-extractor -t 8 -f kiridx/kiridx_rna_seq.fa -o T1K_example_candidate -1 example/example_1.bam -2 example/example_2.fa failed: 512 at ./run-t1k line 58.

Line 58 is the same code as line 61 in the other error from my first message. The newer version of run-t1k has 3 extra comment lines explaining the arguments.

All these containers are Docker images converted to Singularity using "singularity pull 'image'". All the containers are running Ubuntu 22.04 with AMD64 architecture.
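Editorial note: the two numbers reported in this thread (11 and 512) can be read as follows, assuming the value after "failed:" is Perl's raw wait status `$?`, which the thread does not confirm. In that encoding the low 7 bits hold a terminating signal number and the high byte holds the exit code. `decode_status` below is a hypothetical helper written for this note.

```shell
# decode_status: interpret a raw wait status as Perl's $? encodes it.
# Hypothetical helper for illustration; assumes the number is a raw wait status.
decode_status() {
  status=$1
  sig=$((status & 127))   # low 7 bits: terminating signal, 0 if none
  code=$((status >> 8))   # high byte: exit code of the child
  if [ "$sig" -ne 0 ]; then
    echo "killed by signal $sig"
  else
    echo "exited with code $code"
  fi
}

decode_status 11    # prints: killed by signal 11
decode_status 512   # prints: exited with code 2
```

Under that assumption, 11 would mean fastq-extractor was killed by signal 11 (SIGSEGV on Linux), and 512 would mean the process exited with code 2, which fits a shell rejecting the file with a syntax error. A shell reporting a syntax error for what should be a compiled program usually means the file could not be executed as a native binary and was interpreted as a script instead.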

mourisl commented 2 months ago

AMD architecture should be fine. For this test, I guess the issue is that the -1 and -2 files are wrong? There is no bam or fa file in the example data. It's also strange to see the syntax error "(".

sihan2k commented 2 months ago

Yeah, I see that the test command is wrong. I have now changed the files for -1 and -2 to match the files in the example folder. I also noticed there was no kiridx or hlaidx folder inside the Singularity container. Therefore I used a local reference file generated by the Perl script, -f /ngc/shared/T1K/hlaidx/hlaidx_rna_seq.fa, in the test command instead (after using --bind on the folder, of course). The error is still unchanged.

Could the problem be that neither of the Singularity containers has run the Perl script from step 3 of the installation?
perl t1k-build.pl -o hlaidx --download IPD-IMGT/HLA
perl t1k-build.pl -o kiridx --download IPD-KIR --partial-intron-noseq

I have a proper reference file for the command, but I assume t1k-build.pl has some other function than creating the reference files?

sihan2k commented 2 months ago

Hello, I have an update on this issue.

After building a Docker image from scratch, instead of using an already built one from Docker Hub, it now runs with no errors. Both the test command and the real command run perfectly in Docker, and in Singularity after converting this Docker image.

I can't tell you what went wrong in creating the other containers or what else caused the issues.

I'll post the Docker image and attach the Dockerfile in case you want to look into it and see if you can find the cause: https://hub.docker.com/r/dimon1209/t1k T1K-dockerfile.txt

Thanks for your help