Open anh151 opened 10 months ago
@anh151,
Thanks for raising this. Unfortunately, as you can imagine, it's very difficult for me to help without seeing the full version that you re-coded. I think the best solution is to ask your admin to install Singularity and then use our official StellarPGx versions. I am not sure this GitHub issue tracker is really the best place to debug customised versions, as that would be intractable on our end.
David
Hey @twesigomwedavid
This analysis is in All of Us. We are attempting to call CYP2D6 on ~245k samples with several PGx callers to make the calls available to the rest of the researchers using the cohort. I talked to the program's support and data science team about installing Singularity, but they couldn't do it for security reasons. The only remaining options were to not use StellarPGx, to convert the Nextflow pipeline to Cromwell, or to convert it to Python. Python seemed like the easiest and best option, so that's what I went with. Attached is the Python script used to run StellarPGx. I have been able to run it successfully on all samples except the 11 with the errors shown above. I have also confirmed that many of the samples have agreeing calls across multiple callers. If you are unable to address these errors, then we can just report the 11 samples as Indeterminate. stellarpgx.py.zip
@anh151,
I think the easier solution would have been to install the tools within the StellarPGx workflow separately. That way, no rewrite would have been needed (except disabling Singularity in the nextflow.config file). Is this still possible by any chance? Since you're running the tools using Python, I imagine that all of them are installed on your system already and can be called upon using Nextflow as the current StellarPGx version is set up to do.
Thanks
David
@twesigomwedavid
That is a better idea that I didn't think of. I have now implemented your suggestion. I turned off Docker and Singularity in the config file. I managed to run it successfully on a sample that should succeed, and the call is correct. What information do you need to help you debug? Do you need the entire work directory for each sample or just the final `call_stars` function that is failing?
Versions, just in case they're important. I can install other versions of anything; I just wasn't sure what was needed, so I used what was preinstalled: bcftools 1.12, tabix 1.10, graphtyper 2.7.6, python 3.10.12
Thanks, Andrew
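With containers disabled, Nextflow has to find every tool on the local PATH. A minimal sketch of a pre-flight check is below. The tool list is taken from the versions mentioned above, and `missing_tools` is a hypothetical helper name, not part of StellarPGx.

```python
import shutil

# Tool names taken from the version list above; adjust to your installation.
REQUIRED_TOOLS = ["bcftools", "tabix", "graphtyper", "python3"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

This only confirms that an executable with each name is reachable. It does not check versions.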
@anh151, It depends. Are you still getting the same errors even after implementing this suggestion? If you are, then the more relevant info would be in the .command.err file in the work directory of the particular sample. Also, be sure to update to the latest StellarPGx version.
Kind regards, David
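For anyone gathering the same information across many samples, here is a rough sketch of collecting the non-empty .command.err files from a Nextflow work directory. The function name `collect_command_errors` and the `work` path are illustrative assumptions. Only the .command.err file name comes from the comment above.

```python
from pathlib import Path


def collect_command_errors(work_dir):
    """Map each task directory under work_dir to its non-empty .command.err text."""
    errors = {}
    for err_file in sorted(Path(work_dir).rglob(".command.err")):
        text = err_file.read_text(errors="replace").strip()
        if text:  # skip tasks that wrote nothing to stderr
            errors[str(err_file.parent)] = text
    return errors


if __name__ == "__main__":
    # "work" is the default Nextflow work directory name; change it if yours differs.
    for task_dir, text in collect_command_errors("work").items():
        print(f"== {task_dir}")
        print(text)
```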
@twesigomwedavid Thanks for all of the help.
Yes, I am still getting the same errors. Attached are the errors for each of the samples. Yes, I am using the latest version of StellarPGx. Let me know if you need any other info. errors.zip
Thanks, Andrew
@anh151,
I think it might be best to set up a Zoom call to discuss this rather than going back and forth over GitHub.
Let's set up a time via direct email. My time zone is GMT+2
David
Hi David, This error is a bit unrelated and I can create a new issue if needed.
In `main.nf`, shouldn't `call_sv_del` and `call_sv_dup` have `${cram_options}` for the graphtyper commands? Without it I get the error below because it's trying to access a different reference genome.
Error executing process > 'call_sv_dup (1)'
Caused by:
Process `call_sv_dup (1)` terminated with an error exit status (1)
Command executed:
graphtyper genotype_sv lib/Homo_sapiens_assembly38.fasta --sam=wgs_1056081.cram --region=chr22:42126000-42137500 --output=wgs_1056081_sv_dup res_hg38/sv_test3.vcf.gz
Command exit status:
1
Command output:
(empty)
Command error:
[W::find_file_url] Failed to open reference "https://www.ebi.ac.uk/ena/cram/md5/ac37ec46683600f808cdd41eac1d55cd": Protocol not supported
[E::cram_get_ref] Failed to populate reference for id 21
[E::cram_decode_slice] Unable to fetch reference #21 42109689..42137150
[E::cram_next_slice] Failure to decode slice
[2024-01-23 22:24:08.524] <error> hts_reader.cpp:252 htslib failed BAM/CRAM reading of wgs_1056081.cram and returned -2
Work dir:
/home/jupyter/workspaces/pharmacogenomichaplotypecharacterization/bin/StellarPGx/work/ff/2405f069aee4498acf0a6e725024da
Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
Thanks, Andrew
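One way to check whether the local FASTA is actually the reference the CRAM expects is to compare MD5 checksums. The warning above suggests htslib could not resolve the reference locally and tried to fetch it by MD5 from the EBI URL. The sketch below pulls that MD5 out of the error text and computes per-contig MD5s for a FASTA, using the M5 convention from the SAM specification (uppercased sequence, whitespace removed). The function names are hypothetical, and it assumes an uncompressed FASTA.

```python
import hashlib
import re

# Matches the checksum in URLs like https://www.ebi.ac.uk/ena/cram/md5/<md5>
MD5_URL = re.compile(r"/cram/md5/([0-9a-f]{32})")


def md5s_requested(stderr_text):
    """Return the reference MD5s that appear in lookup URLs, in order of appearance."""
    return MD5_URL.findall(stderr_text)


def fasta_md5s(fasta_path):
    """Return {contig_name: md5} for each sequence in an uncompressed FASTA file."""
    digests = {}
    name, md5 = None, None
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                if name is not None:
                    digests[name] = md5.hexdigest()
                parts = line[1:].split()
                name = parts[0] if parts else ""
                md5 = hashlib.md5()
            elif name is not None:
                # Hash the uppercased sequence with all whitespace removed.
                md5.update("".join(line.split()).upper().encode("ascii"))
    if name is not None:
        digests[name] = md5.hexdigest()
    return digests


if __name__ == "__main__":
    # Paths are taken from the command in the log above; .command.err is assumed
    # to sit in the failing task's work directory.
    wanted = md5s_requested(open(".command.err").read())
    local = fasta_md5s("lib/Homo_sapiens_assembly38.fasta")
    for md5 in wanted:
        matches = [contig for contig, digest in local.items() if digest == md5]
        print(md5, "->", matches or "no matching contig in local FASTA")
```

If the requested MD5 matches a contig in the local FASTA, the file itself is probably fine and the likelier problem is that the CRAM reader was not pointed at it, which would be consistent with the missing `${cram_options}`. If nothing matches, the CRAM was probably written against a different reference build.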
Hello, I've managed to run StellarPGx on thousands of samples. I did have to rewrite the Nextflow pipeline in Python because of restrictions in my environment. I have a set of 11 samples that are all returning the errors shown below.
Error 1. Occurs in 2 samples:
Error 2. Occurs in 4 samples:
Error 3. Occurs in 3 samples:
Error 4. Occurs in 2 samples:
Thanks, Andrew