Open RuanSpies21 opened 2 months ago
Hi @RuanSpies21 ,
Happy to work on this together, could you please share 5 sample IDs from your dataset?
This way I can test those locally.
Thanks @abhi18av!
ERR038276 ERR038277 ERR038278 ERR038279 ERR038280
Hi @abhi18av - any thoughts on this yet?
Hi @RuanSpies21 ,
Apologies for the late response on this one. I have been able to reproduce this error on my side using the pipeline's default `-k 100` for BWA, which completed in 30 seconds per sample.
This was NOT resolved even when I enabled `bwa_k66` on my side with these samples, raising the runtime for BWA to roughly 40 seconds per sample.
The following statistics were generated for the individual files:
|SAMPLE |AVG_INSERT_SIZE|MAPPED_PERCENTAGE|RAW_TOTAL_SEQS|AVERAGE_BASE_QUALITY|MEAN_COVERAGE|SD_COVERAGE|MEDIAN_COVERAGE|MAD_COVERAGE|PCT_EXC_ADAPTER|PCT_EXC_MAPQ|PCT_EXC_DUPE|PCT_EXC_UNPAIRED|PCT_EXC_BASEQ|PCT_EXC_OVERLAP|PCT_EXC_CAPPED|PCT_EXC_TOTAL|PCT_1X |PCT_5X |PCT_10X |PCT_30X |PCT_50X |PCT_100X|MAPPED_NTM_FRACTION_16S|MAPPED_NTM_FRACTION_16S_THRESHOLD_MET|COVERAGE_THRESHOLD_MET|BREADTH_OF_COVERAGE_THRESHOLD_MET|ALL_THRESHOLDS_MET|
|-------------------------|---------------|-----------------|--------------|--------------------|-------------|-----------|---------------|------------|---------------|------------|------------|----------------|-------------|---------------|--------------|-------------|--------|--------|--------|--------|--------|--------|-----------------------|-------------------------------------|----------------------|---------------------------------|------------------|
|MAGMA.ERX015472_ERR038276|366.5 |73.97 |17112670 |34.5 |154.672354 |71.416821 |157 |48 |0 |0.09915 |0.154413 |0 |0.02558 |0.001099 |0 |0.280241 |0.972886|0.96603 |0.961128|0.942092|0.916305|0.78371 |0.0 |1 |1 |1 |1 |
|MAGMA.ERX015473_ERR038277|384.5 |77.16 |18091946 |35.3 |176.428354 |71.317561 |185 |46 |0 |0.086099 |0.148593 |0 |0.020496 |0.000459 |0 |0.255647 |0.973516|0.966587|0.963094|0.950857|0.935023|0.859417|0.0 |1 |1 |1 |1 |
|MAGMA.ERX015474_ERR038278|407.9 |77.47 |13464688 |35.3 |134.847332 |58.306765 |142 |38 |0 |0.084936 |0.132313 |0 |0.020973 |0.000473 |0 |0.238694 |0.966827|0.959598|0.954376|0.933156|0.903887|0.750069|0.0 |1 |1 |1 |1 |
|MAGMA.ERX015475_ERR038279|427.5 |76.3 |16200744 |35.2 |155.460953 |61.334029 |165 |38 |0 |0.09051 |0.147023 |0 |0.021057 |0.000692 |0 |0.259282 |0.97162 |0.964273|0.960278|0.946518|0.928075|0.832158|0.0 |1 |1 |1 |1 |
|MAGMA.ERX015476_ERR038280|478.8 |75.39 |18525588 |35.2 |171.901534 |69.736791 |180 |42 |0 |0.096019 |0.158024 |0 |0.020307 |0.000607 |0 |0.274956 |0.973743|0.967281|0.96324 |0.949922|0.934053|0.859584|0.0 |1 |1 |1 |1 |
I was also able to reproduce the type-casting issue in the Python script:
INFO: Converting SIF file to temporary sandbox...
Traceback (most recent call last):
File "/home/abhinav/.nextflow/assets/TORCH-Consortium/MAGMA/bin/generate_merged_cohort_stats.py", line 55, in <module>
df_final_cohort_stats['ALL_THRESHOLDS_MET'] = df_final_cohort_stats['MAPPED_NTM_FRACTION_16S_THRESHOLD_MET'].astype('bool') & df_final_cohort_stats['COVERAGE_THRESHOLD_MET'].astype('bool') & df_final_cohort_stats['BREADTH_OF_COVERAGE_THRESHOLD_MET'].astype('bool') & df_final_cohort_stats['RELABUNDANCE_THRESHOLD_MET'].astype('bool')
File "/opt/conda/lib/python3.10/site-packages/pandas/core/generic.py", line 6240, in astype
new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
File "/opt/conda/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 448, in astype
return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
File "/opt/conda/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 352, in apply
applied = getattr(b, f)(**kwargs)
File "/opt/conda/lib/python3.10/site-packages/pandas/core/internals/blocks.py", line 526, in astype
new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
File "/opt/conda/lib/python3.10/site-packages/pandas/core/dtypes/astype.py", line 299, in astype_array_safe
new_values = astype_array(values, dtype, copy=copy)
File "/opt/conda/lib/python3.10/site-packages/pandas/core/dtypes/astype.py", line 227, in astype_array
values = values.astype(dtype, copy=copy)
File "/opt/conda/lib/python3.10/site-packages/pandas/core/arrays/masked.py", line 474, in astype
raise ValueError("cannot convert float NaN to bool")
ValueError: cannot convert float NaN to bool
INFO: Cleaning up image...
I am currently working on a patch to address this issue - thank you for bringing it to my attention!
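For readers who hit the same traceback: the `ValueError` is raised when a column with a nullable dtype still holds missing values at the point where `.astype('bool')` is called. The sketch below is not the patch that was pushed to MAGMA. It only illustrates one way to combine the threshold columns when some values are missing, under the assumption that a missing value should count as "threshold not met". The helper name `all_thresholds_met` and the toy data are hypothetical; the column names are the ones in the traceback.

```python
from functools import reduce

import pandas as pd

THRESHOLD_COLUMNS = [
    "MAPPED_NTM_FRACTION_16S_THRESHOLD_MET",
    "COVERAGE_THRESHOLD_MET",
    "BREADTH_OF_COVERAGE_THRESHOLD_MET",
    "RELABUNDANCE_THRESHOLD_MET",
]


def all_thresholds_met(df, columns=THRESHOLD_COLUMNS):
    """Return 1 where every threshold column is truthy, else 0.

    Assumption (not taken from the MAGMA code): a missing value means
    the threshold was not met.
    """
    flags = []
    for name in columns:
        # Coerce to numbers, replace missing values with 0, then cast.
        numeric = pd.to_numeric(df[name], errors="coerce")
        flags.append(numeric.fillna(0).astype(bool))
    return reduce(lambda left, right: left & right, flags).astype(int)


# Toy data: the second sample has no RELABUNDANCE_THRESHOLD_MET value.
example = pd.DataFrame(
    {
        "MAPPED_NTM_FRACTION_16S_THRESHOLD_MET": pd.array([1, 1], dtype="Int64"),
        "COVERAGE_THRESHOLD_MET": pd.array([1, 1], dtype="Int64"),
        "BREADTH_OF_COVERAGE_THRESHOLD_MET": pd.array([1, 1], dtype="Int64"),
        "RELABUNDANCE_THRESHOLD_MET": pd.array([1, None], dtype="Int64"),
    }
)
example["ALL_THRESHOLDS_MET"] = all_thresholds_met(example)
```

Whether a missing value should be read as "not met" or should instead stop the run is a design decision for the pipeline; filling with 0 is simply the more conservative of the two for a QC flag.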
@RuanSpies21 , could you please try running the pipeline with the following command? I have pushed a patch to master branch now.
NOTE: Please adapt the other options to your context; the essential part is `-r master -latest -resume`.
nextflow run 'https://github.com/TORCH-Consortium/MAGMA' \
  -profile singularity,bwa_k66 \
  -r master \
  -latest \
  -resume \
  -params-file params.magma.yaml
Thank you so much for the help @abhi18av! I'm so sorry, I am not quite getting it right :(
When I run nextflow run 'https://github.com/TORCH-Consortium/MAGMA' -profile docker,server,bwa_k66 -params-file params.yaml -r master -latest -resume
I get: Unknown configuration profile: 'bwa_k66'
If I then add `-c custom.config` with the file mentioned above, I get: ERROR ~ Unknown method invocation `splitJson` on UnixPath type
Seems to be an issue with sample sheet validation? Here is the format of my sample sheet for reference:
Sample,R1,R2
ERR025842,/mnt/volume_data/ruan/walker_2013/ERR025842_1.fastq.gz,/mnt/volume_data/ruan/walker_2013/ERR025842_2.fastq.gz
ERR025843,/mnt/volume_data/ruan/walker_2013/ERR025843_1.fastq.gz,/mnt/volume_data/ruan/walker_2013/ERR025843_2.fastq.gz
I've also attached the Nextflow log (nextflow.log) in case it is helpful.
Thanks again for your help - very sorry to keep bothering!
Hi @RuanSpies21
The samplesheet looks fine to me, but let's make sure that the basics are all set:
nextflow run 'https://github.com/TORCH-Consortium/MAGMA' -profile docker,server,test -r hotfix/bwa_k66
This should make use of the `test` profile, download some samples from the original MAGMA publication, and run them through.
bwa_k66 profile
I have created a new `bwa_k66` profile, which you can use without providing a `-c custom.config` file.
nextflow run 'https://github.com/TORCH-Consortium/MAGMA' -profile docker,server -r hotfix/bwa_k66 --input_samplesheet /path/to/your/samplesheet.csv
> Seems to be an issue with sample sheet validation?

Actually, to me the samplesheet seems valid 🤔
> Thanks again for your help - very sorry to keep bothering!

No worries at all Ruan, this is very helpful. There's no perfect software, but with user feedback and usage, we can keep improving it.
Thank you for your patience!
If this doesn't work, then perhaps we can meet sometime next week? Here's my academic email abhinavsharma at sun dot ac dot za
📆
OK, it looks like it's failing with the same error on the test profile as well.
I ran nextflow run 'https://github.com/TORCH-Consortium/MAGMA' -profile docker,server,test -r hotfix/bwa_k66
Output:
process > SAMPLESHEET_VALIDATION [ 0%] 0 of 1
[- ] process > VALIDATE_FASTQS_WF:FASTQ_VALIDATOR -
[- ] process > VALIDATE_FASTQS_WF:UTILS_FASTQ_COHORT_VALIDATION -
[- ] process > QUALITY_CHECK_WF:FASTQC -
[- ] process > QUALITY_CHECK_WF:NTMPROFILER_PROFILE -
[- ] process > QUALITY_CHECK_WF:NTMPROFILER_COLLATE -
[- ] process > MAP_WF:BWA_MEM -
[- ] process > CALL_WF:SAMTOOLS_MERGE -
[- ] process > CALL_WF:GATK_MARK_DUPLICATES -
[- ] process > CALL_WF:SAMTOOLS_INDEX -
[- ] process > CALL_WF:GATK_HAPLOTYPE_CALLER -
[- ] process > CALL_WF:LOFREQ_CALL__NTM -
[- ] process > CALL_WF:LOFREQ_INDELQUAL -
[- ] process > CALL_WF:SAMTOOLS_INDEX__LOFREQ -
[- ] process > CALL_WF:LOFREQ_CALL -
[- ] process > CALL_WF:LOFREQ_FILTER -
[- ] process > CALL_WF:UTILS_REFORMAT_LOFREQ -
[- ] process > CALL_WF:BGZIP__LOFREQ -
[- ] process > CALL_WF:GATK_INDEX_FEATURE_FILE__LOFREQ -
[- ] process > CALL_WF:SAMTOOLS_STATS -
[- ] process > CALL_WF:GATK_COLLECT_WGS_METRICS -
[- ] process > CALL_WF:GATK_FLAG_STAT -
[- ] process > CALL_WF:UTILS_SAMPLE_STATS -
[- ] process > CALL_WF:UTILS_COHORT_STATS -
[- ] process > MINOR_VARIANTS_ANALYSIS_WF:BCFTOOLS_MERGE__LOFREQ -
[- ] process > MINOR_VARIANTS_ANALYSIS_WF:TBPROFILER_VCF_PROFILE__LOFREQ -
[- ] process > MINOR_VARIANTS_ANALYSIS_WF:TBPROFILER_COLLATE__LOFREQ -
[- ] process > MINOR_VARIANTS_ANALYSIS_WF:UTILS_MULTIPLE_INFECTION_FILTER -
[- ] process > UTILS_MERGE_COHORT_STATS -
[- ] process > STRUCTURAL_VARIANTS_ANALYSIS_WF:BWA_MEM__DELLY -
[- ] process > STRUCTURAL_VARIANTS_ANALYSIS_WF:SAMTOOLS_MERGE__DELLY -
[- ] process > STRUCTURAL_VARIANTS_ANALYSIS_WF:GATK_MARK_DUPLICATES__DELLY -
[- ] process > STRUCTURAL_VARIANTS_ANALYSIS_WF:SAMTOOLS_INDEX__DELLY -
[- ] process > STRUCTURAL_VARIANTS_ANALYSIS_WF:DELLY_CALL -
[- ] process > STRUCTURAL_VARIANTS_ANALYSIS_WF:BCFTOOLS_VIEW__DELLY -
[- ] process > STRUCTURAL_VARIANTS_ANALYSIS_WF:BCFTOOLS_MERGE__DELLY -
[- ] process > STRUCTURAL_VARIANTS_ANALYSIS_WF:TBPROFILER_VCF_PROFILE__DELLY -
[- ] process > STRUCTURAL_VARIANTS_ANALYSIS_WF:TBPROFILER_COLLATE__DELLY -
[- ] process > MERGE_WF:PREPARE_COHORT_VCF:GATK_COMBINE_GVCFS -
[- ] process > MERGE_WF:PREPARE_COHORT_VCF:GATK_GENOTYPE_GVCFS -
[- ] process > MERGE_WF:PREPARE_COHORT_VCF:SNPEFF -
[- ] process > MERGE_WF:PREPARE_COHORT_VCF:BGZIP -
[- ] process > MERGE_WF:PREPARE_COHORT_VCF:GATK_INDEX_FEATURE_FILE__COHORT -
[- ] process > MERGE_WF:SNP_ANALYSIS:GATK_SELECT_VARIANTS__SNP -
[- ] process > MERGE_WF:SNP_ANALYSIS:OPTIMIZE_VARIANT_RECALIBRATION:GATK_VARIANT_RECALIBRATOR__ANN7 -
[- ] process > MERGE_WF:SNP_ANALYSIS:OPTIMIZE_VARIANT_RECALIBRATION:UTILS_ELIMINATE_ANNOTATION__ANN7 -
[- ] process > MERGE_WF:SNP_ANALYSIS:OPTIMIZE_VARIANT_RECALIBRATION:GATK_VARIANT_RECALIBRATOR__ANN6 -
[- ] process > MERGE_WF:SNP_ANALYSIS:OPTIMIZE_VARIANT_RECALIBRATION:UTILS_ELIMINATE_ANNOTATION__ANN6 -
[- ] process > MERGE_WF:SNP_ANALYSIS:OPTIMIZE_VARIANT_RECALIBRATION:GATK_VARIANT_RECALIBRATOR__ANN5 -
[- ] process > MERGE_WF:SNP_ANALYSIS:OPTIMIZE_VARIANT_RECALIBRATION:UTILS_ELIMINATE_ANNOTATION__ANN5 -
[- ] process > MERGE_WF:SNP_ANALYSIS:OPTIMIZE_VARIANT_RECALIBRATION:GATK_VARIANT_RECALIBRATOR__ANN4 -
[- ] process > MERGE_WF:SNP_ANALYSIS:OPTIMIZE_VARIANT_RECALIBRATION:UTILS_ELIMINATE_ANNOTATION__ANN4 -
[- ] process > MERGE_WF:SNP_ANALYSIS:OPTIMIZE_VARIANT_RECALIBRATION:GATK_VARIANT_RECALIBRATOR__ANN3 -
[- ] process > MERGE_WF:SNP_ANALYSIS:OPTIMIZE_VARIANT_RECALIBRATION:UTILS_ELIMINATE_ANNOTATION__ANN3 -
[- ] process > MERGE_WF:SNP_ANALYSIS:OPTIMIZE_VARIANT_RECALIBRATION:GATK_VARIANT_RECALIBRATOR__ANN2 -
[- ] process > MERGE_WF:SNP_ANALYSIS:OPTIMIZE_VARIANT_RECALIBRATION:UTILS_ELIMINATE_ANNOTATION__ANN2 -
[- ] process > MERGE_WF:SNP_ANALYSIS:OPTIMIZE_VARIANT_RECALIBRATION:UTILS_SELECT_BEST_ANNOTATIONS -
[- ] process > MERGE_WF:SNP_ANALYSIS:GATK_APPLY_VQSR__SNP -
[- ] process > MERGE_WF:SNP_ANALYSIS:GATK_SELECT_VARIANTS__EXCLUSION__SNP -
[- ] process > MERGE_WF:INDEL_ANALYSIS:GATK_SELECT_VARIANTS__INDEL -
[- ] process > MERGE_WF:GATK_MERGE_VCFS__INC -
[- ] process > MERGE_WF:MAJOR_VARIANT_ANALYSIS:TBPROFILER_VCF_PROFILE__COHORT -
[- ] process > MERGE_WF:MAJOR_VARIANT_ANALYSIS:TBPROFILER_COLLATE__COHORT -
[- ] process > MERGE_WF:PHYLOGENY_ANALYSIS__EXCOMPLEX:GATK_SELECT_VARIANTS__PHYLOGENY -
[- ] process > MERGE_WF:PHYLOGENY_ANALYSIS__EXCOMPLEX:GATK_VARIANTS_TO_TABLE -
[- ] process > MERGE_WF:PHYLOGENY_ANALYSIS__EXCOMPLEX:SNPSITES -
[- ] process > MERGE_WF:PHYLOGENY_ANALYSIS__EXCOMPLEX:SNPDISTS -
[- ] process > MERGE_WF:PHYLOGENY_ANALYSIS__EXCOMPLEX:IQTREE -
[- ] process > MERGE_WF:CLUSTER_ANALYSIS__EXCOMPLEX:CLUSTERPICKER__5SNP -
[- ] process > MERGE_WF:CLUSTER_ANALYSIS__EXCOMPLEX:CLUSTERPICKER__12SNP -
[- ] process > MERGE_WF:PHYLOGENY_ANALYSIS__INCCOMPLEX:GATK_SELECT_VARIANTS__PHYLOGENY -
[- ] process > MERGE_WF:PHYLOGENY_ANALYSIS__INCCOMPLEX:GATK_VARIANTS_TO_TABLE -
[- ] process > MERGE_WF:PHYLOGENY_ANALYSIS__INCCOMPLEX:SNPSITES -
[- ] process > MERGE_WF:PHYLOGENY_ANALYSIS__INCCOMPLEX:SNPDISTS -
[- ] process > MERGE_WF:PHYLOGENY_ANALYSIS__INCCOMPLEX:IQTREE -
[- ] process > MERGE_WF:CLUSTER_ANALYSIS__INCCOMPLEX:CLUSTERPICKER__5SNP -
[- ] process > MERGE_WF:CLUSTER_ANALYSIS__INCCOMPLEX:CLUSTERPICKER__12SNP -
[- ] process > REPORTS_WF:MULTIQC -
[- ] process > REPORTS_WF:UTILS_SUMMARIZE_RESISTANCE_RESULTS -
[- ] process > REPORTS_WF:UTILS_SUMMARIZE_RESISTANCE_RESULTS_MIXED_INFECTION -
WARN: There's no process matching config selector: VALIDATE_FASTQS_WF:SAMPLESHEET_VALIDATION
ERROR ~ Unknown method invocation `splitJson` on UnixPath type
-- Check '.nextflow.log' file for details
Then I think the problem might be with your Java setup. Could you please confirm you're using an LTS version, as mentioned here: https://github.com/TORCH-Consortium/MAGMA?tab=readme-ov-file#nextflow ?
I can confirm I'm using an LTS version of Java (17).
I don't seem to get the same error when using the alpha pre-release of v2.0.0 nextflow run 'https://github.com/TORCH-Consortium/MAGMA' -profile docker,server -r v2.0.0-alpha -params-file params.yaml
In this case the pipeline runs successfully through the samplesheet validation step
Hmm, then the next suspect is the version of Nextflow; I think upgrading it should fix the problem.
Could you please test with the following command? 🙏
NXF_VER=24.04.4 nextflow run 'https://github.com/TORCH-Consortium/MAGMA' -profile docker,server,test -r hotfix/bwa_k66
If this works, then I will set the minimum Nextflow version to 24.04.x in the pipeline, and you should upgrade by typing `nextflow -self-update`.
Ok great! Test seems to have worked. Thanks for the help. Will give it a bash with these old sequences now - holding thumbs, will let you know how it goes.
nextflow run 'https://github.com/TORCH-Consortium/MAGMA' -profile docker,server,test -r hotfix/bwa_k66
I'm just getting loads of failures at VALIDATE_FASTQS_WF:FASTQ_VALIDATOR:
[100%] 14 of 14, failed: 12, retries: 8 ✔ - from the test profile. So only 1 of the 3 samples is actually processed.
[100%] 830 of 830, failed: 738, retries: 492 ✔ - from my sequences. Only 46/169 samples processed.
Good, so we're past the setup issues.
> [100%] 14 of 14, failed: 12, retries: 8 ✔ - from test profile. So only 1 of the 3 samples is actually processed

I wouldn't worry too much about the samples from `test`, since samples downloaded from NCBI (FTP) often get corrupted in transit if the network or disk performance is not good.
> [100%] 830 of 830, failed: 738, retries: 492 ✔ - from my sequences. Only 46/169 samples processed

So it seems that these samples were likely corrupted, either while downloading or while being moved across external disks/computers.
⚠️ That is the reason why we ended up adding a separate `VALIDATE_FASTQS_WF:FASTQ_VALIDATOR` process.
One file which you might want to inspect is `QC_statistics/cohort/fastq_validation/magma_analysis.json`, which should gather information about the files, such as `md5sum` and `size`, along with stats generated by `seqkit` etc. Perhaps that might be useful in debugging the failing samples.
I'd recommend you download your samples from NCBI/ENA using the nf-core/fetchngs pipeline (https://nf-co.re/fetchngs/1.12.0/docs/usage/), which makes sure the samples are not corrupted.
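Independently of the pipeline, a quick local check can show whether the files listed in a samplesheet are readable at all. The following is a minimal sketch, assuming the `Sample,R1,R2` samplesheet layout shown earlier in this thread and gzipped FASTQ files with four lines per record. The function names are hypothetical, and this is not how `FASTQ_VALIDATOR` itself works; it only reports size, MD5, and whether the gzip stream can be read to the end.

```python
import csv
import gzip
import hashlib
import zlib
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 of a file without loading it all into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fastq_gz(path):
    """Return a small report (dict) for one gzipped FASTQ file."""
    path = Path(path)
    report = {"path": str(path), "ok": False, "records": 0, "detail": ""}
    if not path.is_file():
        report["detail"] = "file not found"
        return report
    report["size"] = path.stat().st_size
    report["md5"] = md5sum(path)
    lines = 0
    try:
        # Reading to the end forces the whole gzip stream to be decoded.
        with gzip.open(path, "rb") as handle:
            for lines, _ in enumerate(handle, start=1):
                pass
    except (OSError, EOFError, zlib.error) as error:
        report["detail"] = f"gzip stream unreadable: {error}"
        return report
    if lines == 0 or lines % 4 != 0:
        report["detail"] = f"{lines} lines is not a whole number of FASTQ records"
        return report
    report.update(ok=True, records=lines // 4)
    return report


def check_samplesheet(samplesheet):
    """Check every R1/R2 file listed in a Sample,R1,R2 samplesheet."""
    reports = []
    with open(samplesheet, newline="") as handle:
        for row in csv.DictReader(handle):
            for column in ("R1", "R2"):
                report = check_fastq_gz(row[column])
                report["sample"] = row["Sample"]
                reports.append(report)
    return reports
```

Calling `check_samplesheet("samplesheet.csv")` and printing the entries where `ok` is false gives a list of files worth re-downloading. A file that passes here can still fail stricter validation, for example on mismatched R1/R2 read counts, which this sketch does not compare.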
Thanks for this @abhi18av. It's a long journey we have been on together now 😂. It seems the pipeline really does not like these old files.
I re-downloaded some of them with nf-core/fetchngs, but a large number of failures persist at VALIDATE_FASTQS_WF:FASTQ_VALIDATOR.
Further, those that do pass have 0 coverage.
magma_analysis.json and joint.merged_cohort_stats.tsv are attached for interest.
As a sanity check, a batch of newer fastqs processed successfully, so the setup is fine. [magma_analysis.json](https://github.com/user-attachments/files/17054048/magma_analysis.json) joint.merged_cohort_stats.txt
Hi @RuanSpies21
> It seems the pipeline really does not like these old files.

Actually, I would need more evidence to believe that, since we've been using MAGMA to analyse all Brazilian and South African sequences from SRA produced in the last 20 years, and unless there's something wrong with the samples themselves, they get through.
That is the reason why we added the JSON file, so that we can have a better overview of the samples which failed. Could you please share the `QC_statistics/cohort/fastq_validation/magma_analysis.json` file with me?
> Further, those that do pass have 0 coverage.

Indeed, the results here are very suspicious. I will try to run these samples on my end to see if they are at least reproducible.
SAMPLE | AVG_INSERT_SIZE | MAPPED_PERCENTAGE | RAW_TOTAL_SEQS | AVERAGE_BASE_QUALITY | MEAN_COVERAGE | SD_COVERAGE | MEDIAN_COVERAGE | MAD_COVERAGE | PCT_EXC_ADAPTER | PCT_EXC_MAPQ | PCT_EXC_DUPE | PCT_EXC_UNPAIRED | PCT_EXC_BASEQ | PCT_EXC_OVERLAP | PCT_EXC_CAPPED | PCT_EXC_TOTAL | PCT_1X | PCT_5X | PCT_10X | PCT_30X | PCT_50X | PCT_100X | LINEAGES | FREQUENCIES | MAPPED_NTM_FRACTION_16S | MAPPED_NTM_FRACTION_16S_THRESHOLD_MET | COVERAGE_THRESHOLD_MET | BREADTH_OF_COVERAGE_THRESHOLD_MET | RELABUNDANCE_THRESHOLD_MET | ALL_THRESHOLDS_MET |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
MAGMA.ERX023849_ERR046787 | 0.0 | 0.0 | 7272484.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 0 | 0 | 0 | 0 | ||
MAGMA.ERX023851_ERR046789 | 0.0 | 0.0 | 3365516.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 0 | 0 | 0 | 0 | ||
MAGMA.ERX023852_ERR046790 | 0.0 | 0.0 | 7425878.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 0 | 0 | 0 | 0 | ||
MAGMA.ERX023853_ERR046791 | 0.0 | 0.0 | 6207574.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 0 | 0 | 0 | 0 | ||
MAGMA.ERX023885_ERR046823 | 0.0 | 0.0 | 6324052.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 0 | 0 | 0 | 0 | ||
MAGMA.ERX023913_ERR046851 | 0.0 | 0.0 | 6399012.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 0 | 0 | 0 | 0 | ||
MAGMA.ERX023975_ERR046913 | 0.0 | 0.0 | 6674844.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 0 | 0 | 0 | 0 | ||
MAGMA.ERX024002_ERR046940 | 0.0 | 0.0 | 6617920.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 0 | 0 | 0 | 0 | ||
MAGMA.ERX024012_ERR046950 | 0.0 | 0.0 | 6311164.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 0 | 0 | 0 | 0 | ||
MAGMA.ERX049831_ERR072065 | 0.0 | 0.0 | 3248118.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 0 | 0 | 0 | 0 | ||
MAGMA.ERX049843_ERR072077 | 0.0 | 0.0 | 3311862.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 0 | 0 | 0 | 0 | ||
MAGMA.ERX049846_ERR072080 | 0.0 | 0.0 | 2881802.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 0 | 0 | 0 | 0 |
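To pull out rows like these from a larger cohort, a few lines of Python are enough. This is a sketch under the assumption that `joint.merged_cohort_stats.tsv` is tab-separated and uses the column names shown in the table above; the function name `zero_coverage_samples` is hypothetical.

```python
import csv


def zero_coverage_samples(tsv_path):
    """Return names of samples that have reads but no mapping or coverage."""
    flagged = []
    with open(tsv_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            # Empty cells are treated as 0 so that they do not raise.
            reads = float(row["RAW_TOTAL_SEQS"] or 0)
            mapped = float(row["MAPPED_PERCENTAGE"] or 0)
            coverage = float(row["MEAN_COVERAGE"] or 0)
            if reads > 0 and mapped == 0 and coverage == 0:
                flagged.append(row["SAMPLE"])
    return flagged
```

Samples flagged this way have millions of raw reads and nothing mapped, which points at the mapping step or the input files rather than at low sequencing depth.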
Ah ok I see.
Here is the `QC_statistics/cohort/fastq_validation/magma_analysis.json` file: magma_analysis.json
Hi @RuanSpies21, just letting you know that I'm still tracking this; I'm just running into some resource constraints these days on our shared server.
No worries @abhi18av! Thank you so much - have already been so accommodating
Hi there,
I am trying to run the pipeline on some older fastq files (circa 2010s) using the docker profile. The reads for the files are relatively short at ~75bp. Following previous advice from Abhinav, I have created a custom.config file with contents:
which I specify with the -c argument. So my full command is:
nextflow run . -params-file params/params.yaml -profile docker,server,bwa_k66 -c custom.config
However, I get the following error:
I think this is due to the samples returning with 0 coverage (when I check /mnt/volume_data/ruan/walker_2013/MAGMA/magma-results/QC_statistics/per_sample/coverage, all have 0).
Any ideas what could be going on here or any workarounds? ERR038264 is an example fastq
Thanks! Ruan
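Since the reads here are short, it can help to measure read lengths directly and compare them with the BWA-MEM minimum seed length (`-k`). As far as I understand, a read shorter than `-k` cannot produce a seed of that length, so such reads would be expected to stay unmapped. The sketch below samples a gzipped FASTQ and reports the fraction of reads shorter than a given seed length. The function names are hypothetical, and the value 66 in the usage note is only an assumption drawn from the `bwa_k66` profile name.

```python
import gzip
from collections import Counter


def read_length_counts(fastq_gz, max_reads=100_000):
    """Count read lengths in the first max_reads records of a gzipped FASTQ."""
    counts = Counter()
    with gzip.open(fastq_gz, "rt") as handle:
        for index, line in enumerate(handle):
            if index // 4 >= max_reads:
                break
            if index % 4 == 1:  # the sequence line of each 4-line record
                counts[len(line.strip())] += 1
    return counts


def fraction_shorter_than(counts, seed_length):
    """Fraction of sampled reads that are shorter than seed_length."""
    total = sum(counts.values())
    short = sum(n for length, n in counts.items() if length < seed_length)
    return short / total if total else 0.0
```

For example, `fraction_shorter_than(read_length_counts("ERR038264_1.fastq.gz"), 66)` gives the share of sampled reads that are too short for a seed length of 66; the file name is illustrative and follows the `_1.fastq.gz` naming used in the samplesheet above.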