Closed ashwini06 closed 2 years ago
Solution from @karlnyr: cg commands to link multiple FASTQ files to a single case ID.
[0|0|0] 10d [hiseq.clinical@hasta:~] [P_main] 19s 2 $ cg add family --priority standard -p OMIM-AUTO -a balsamic -dd scout cust000 panel_of_normal_20211222
givingcobra: new case added
[0|0|0] θ60° 10d [hiseq.clinical@hasta:~] [P_main] 22s $ for sample in `awk '$1 !~ /sample_id/ {print $1}' /home/proj/long-term-stage/cancer/PON_analysis_runs_APJ/GMCKsolid_normalblood_custID_VW.txt | uniq`; do cg add relationship -s unknown givingcobra $sample; done
cg workflow balsamic link givingcobra
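A note on the loop above: `uniq` only collapses adjacent duplicate lines, so if the sample sheet is not sorted by sample ID, repeated IDs could still reach `cg add relationship`. A minimal sketch of a more defensive variant, assuming the sheet is whitespace-delimited with the sample ID in the first column. The `unique_sample_ids` helper and the `SAMPLE_SHEET` variable are hypothetical names introduced here, and the loop only echoes the commands as a dry run:

```shell
# Print each distinct sample ID from the first column of a sample sheet,
# skipping blank lines and the header row (first field contains "sample_id").
# sort -u removes duplicates even when they are not adjacent.
unique_sample_ids() {
    awk 'NF && $1 !~ /sample_id/ {print $1}' "$1" | sort -u
}

# Dry run: print the cg commands instead of executing them.
# SAMPLE_SHEET is a placeholder; set it to the path of the real sheet.
SAMPLE_SHEET="${SAMPLE_SHEET:-}"
if [ -n "$SAMPLE_SHEET" ]; then
    for sample in $(unique_sample_ids "$SAMPLE_SHEET"); do
        echo cg add relationship -s unknown givingcobra "$sample"
    done
fi
```

Dropping the `echo` would run the commands for real. Note that `sort -u` changes the order in which samples are added, which should not matter for linking.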
However, if the FASTQ files still need to be decompressed, the link command fails with an error like:
cg workflow balsamic link givingcobra
Case givingcobra exists in status db
Fetching files from bundle ACC6516A1
Fetching files with tags in [fastq]
Concatenation in progress for sample ACC6516A1.
Concatenating: ->
Traceback (most recent call last):
File "/home/proj/bin/conda/envs/P_main/bin/cg", line 8, in <module>
sys.exit(base())
File "/home/proj/bin/conda/envs/P_main/lib/python3.7/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/home/proj/bin/conda/envs/P_main/lib/python3.7/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/home/proj/bin/conda/envs/P_main/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/proj/bin/conda/envs/P_main/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/proj/bin/conda/envs/P_main/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/proj/bin/conda/envs/P_main/lib/python3.7/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/proj/bin/conda/envs/P_main/lib/python3.7/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/home/proj/bin/conda/envs/P_main/lib/python3.7/site-packages/click/decorators.py", line 27, in new_func
return f(get_current_context().obj, *args, **kwargs)
File "/home/proj/bin/conda/envs/P_main/lib/python3.7/site-packages/cg/cli/workflow/commands.py", line 70, in link
analysis_api.link_fastq_files(case_id=case_id)
File "/home/proj/bin/conda/envs/P_main/lib/python3.7/site-packages/cg/meta/workflow/balsamic.py", line 129, in link_fastq_files
case_obj=case_obj, sample_obj=link.sample, concatenate=True
File "/home/proj/bin/conda/envs/P_main/lib/python3.7/site-packages/cg/meta/workflow/analysis.py", line 304, in link_fastq_files_for_sample
self.fastq_handler.concatenate(linked_reads_paths[read], concatenated_paths[read])
File "/home/proj/bin/conda/envs/P_main/lib/python3.7/site-packages/cg/meta/workflow/fastq.py", line 37, in concatenate
with open(concat_file, "wb") as write_file_obj:
FileNotFoundError: [Errno 2] No such file or directory: ''
To decompress and link the files, run:
cg workflow balsamic start givingcobra -r
This command does not start the Balsamic analysis, because the case has more than one normal sample; it returns the following error:
ACC8839A2 normal tgs gmcksolid_4.1_hg19_design.bed
Could not create config: Invalid number of normal samples: 26, only up to 1 allowed!!
Aborted!
But now all the FASTQ files are linked to a single case ID, under /home/proj/production/cancer/cases/givingcobra/fastq
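To confirm that the linking worked, one could count the files in that directory. This is only a sketch: it assumes the linked files carry a `.fastq.gz` suffix and sit directly in the directory as regular files or symlinks, neither of which is confirmed in this thread. `count_fastq` and `FASTQ_DIR` are hypothetical names.

```shell
# Count FASTQ files (regular files or symlinks) directly under a directory.
# The '*.fastq.gz' suffix is an assumption about how the linked files are named.
count_fastq() {
    find "$1" -maxdepth 1 \( -type f -o -type l \) -name '*.fastq.gz' | wc -l | tr -d ' '
}

# Default to the case directory mentioned above; override FASTQ_DIR to check another case.
FASTQ_DIR="${FASTQ_DIR:-/home/proj/production/cancer/cases/givingcobra/fastq}"
if [ -d "$FASTQ_DIR" ]; then
    echo "FASTQ files in $FASTQ_DIR: $(count_fastq "$FASTQ_DIR")"
fi
```

With 26 samples in the case, a count of zero or an odd number for paired-end data would suggest that something went wrong during decompression or concatenation.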
Closing since there is nothing else that has to be done on Balsamic side. Will be addressed in CG: https://github.com/Clinical-Genomics/cg/issues/1522.
Is your feature request related to a problem? Please describe. The idea is to generate a PON (panel of normals) reference from all the normal blood samples from the GMCKsolid panel that were sequenced earlier here at Clinical Genomics.
Describe the solution you'd like Samples were selected by filtering on the target panel 'GMCKsolid', on cases with Tumor-Normal samples (irrespective of customer ID), and on normal samples of blood origin. This search gave ~70 samples. Of these, 53 can be used for creating the GMCKsolid PON, and after removing duplicate sample IDs, 26 samples remain for PON generation.
GMCKsolid_normalblood_custID_VW.xlsx
Expected output for the feature Generate a PON reference using the PON workflow:
GMCKsolid_caseid_PON_reference.cnn