epi2me-labs / wf-somatic-variation

non-matched sample_name, tumour_bam and normal_bam from the html report #24

Closed selmapichot closed 3 months ago

selmapichot commented 5 months ago

Operating System

Other Linux (please specify below)

Other Linux

rocky linux 8.9

Workflow Version

v1.1.0-g5851e21

Workflow Execution

Command line (Cluster)

Other workflow execution

No response

EPI2ME Version

No response

CLI command run

nextflow run epi2me-labs/wf-somatic-variation --bam_normal "/rds/project/sorted_aligned_TB21.06450_normal.bam" --bam_tumor "/rds/project/sorted_aligned_TB21.06450_t1.bam" --ref "/rds/project/reference/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna" --snv --mod --sample_name "somatic_TB21_06450_T1_invert" --normal_min_coverage 5 --tumor_min_coverage 5 --phase_normal --out_dir "/rds/epi2me_workflow/somatic_variation/TB21_06450" -resume -profile singularity

Workflow Execution - CLI Execution Profile

None

What happened?

Hi, I have noticed that in the SNV and mod HTML reports, the sample_name, tumour_bam and normal_bam listed at the end do not match the ones I used when I ran the command. I am a bit confused as to which samples were actually analysed by the workflow. Any help would be greatly appreciated. Many thanks

Relevant log output

Sample names on the report are:
bam_normal  /rds/rebasecall_TB23_00177/sorted_dorado_aligned_TB23_00177_tl1_grch38.bam
bam_tumor   /rds/rebasecall_TB23_00177/sorted_dorado_aligned_TB23_00177_tl4_grch38.bam

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

yes

Other demo data information

No response

RenzoTale88 commented 5 months ago

Hi @selmapichot do you mind checking the command line used in the /rds/epi2me_workflow/somatic_variation/TB21_06450/execution/report.html?

selmapichot commented 5 months ago

Hi, the command in /rds/epi2me_workflow/somatic_variation/TB21_06450/execution/report.html seems to be the correct one, with the correct sample. So should I assume that the correct sample was indeed the one analysed?

RenzoTale88 commented 5 months ago

Let's have a better look into it. Do you still have the work directory produced by the analysis?

selmapichot commented 5 months ago

yes I do.

RenzoTale88 commented 5 months ago

OK, great! Can you please look for the hash value of the task called alignment_stats:makeQCreport in the Tasks table at the end of /rds/epi2me_workflow/somatic_variation/TB21_06450/execution/report.html? That value is the beginning of the task's path in the work directory:

ls work/{insert hash here}*/

Can you let me know the files in the folder please?

selmapichot commented 5 months ago

The HTML report only shows "(tasks table omitted because the dataset is too big)"...

RenzoTale88 commented 5 months ago

OK, then let's try something else. Check the trace.txt file instead:

grep makeQCreport /rds/epi2me_workflow/somatic_variation/TB21_06450/execution/trace.txt

And retrieve the hash value as described above.
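As a rough sketch of that lookup, the helper below pulls the hash, status and name of matching tasks out of a Nextflow trace file. It assumes the trace is tab-separated with a header row containing `hash`, `name` and `status` columns (the Nextflow defaults). The function name `find_task_hash` is made up for illustration.

```shell
# Print hash, status and name for every trace row whose task name contains
# the given string. Columns are located by header name, so the order of the
# fields in trace.txt does not matter.
# Usage: find_task_hash TRACE_FILE TASK_SUBSTRING
find_task_hash() {
    awk -F'\t' -v needle="$2" '
        NR == 1 {
            # Map each header name to its column index.
            for (i = 1; i <= NF; i++) col[$i] = i
            if (!("hash" in col) || !("name" in col) || !("status" in col)) {
                print "trace file lacks hash/name/status columns" > "/dev/stderr"
                exit 1
            }
            next
        }
        # Plain substring match on the task name (no regex interpretation).
        index($(col["name"]), needle) > 0 {
            print $(col["hash"]) "\t" $(col["status"]) "\t" $(col["name"])
        }
    ' "$1"
}
```

For example, `find_task_hash /rds/epi2me_workflow/somatic_variation/TB21_06450/execution/trace.txt makeQCreport` should print one line per matching task. The first field (of the form `ab/cdef12`) is the value to use in `ls work/ab/cdef12*/`. No output means no task with that name was recorded in the trace.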

RenzoTale88 commented 5 months ago

@selmapichot any progress with this?

selmapichot commented 5 months ago

Hi, sorry, I couldn't get anything from running this command, or maybe I'm not sure what exactly to look for. I ran another sample through the somatic_variation workflow 3 days ago and I still have the same issue in the report.

RenzoTale88 commented 5 months ago

Thanks for confirming that the issue is still present. We need to figure out what is going on in the process to be able to understand what is causing this issue. Do you still have the work directory generated by the workflow?

selmapichot commented 5 months ago

Yes, I still have the work directory.

RenzoTale88 commented 5 months ago

OK, that's good. The first thing is to check the content of execution/report.html in the output directory. It should contain a table listing the processes executed. You need to find the row referring to the makeQCreport process, as we will need the hash reported there.

RenzoTale88 commented 5 months ago

@selmapichot alternatively, do you mind sharing the file execution/trace.txt in the output directory?

selmapichot commented 5 months ago

trace.txt

Yes of course, please find it attached.

RenzoTale88 commented 5 months ago

Based on the trace.txt file, you haven't run the reporting script, and therefore you shouldn't have the report at all. I can see that you have a lot of cached processes.
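A quick way to see how many tasks were cached rather than executed is to count the values in the `status` column of trace.txt. This is a minimal sketch that assumes a tab-separated trace with a header row. The function name `count_task_status` is hypothetical.

```shell
# Count tasks per status (e.g. COMPLETED, CACHED, FAILED) in a Nextflow
# trace file. Prints "STATUS<TAB>COUNT", sorted by status name.
# Usage: count_task_status TRACE_FILE
count_task_status() {
    awk -F'\t' '
        # Locate the status column from the header row.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "status") s = i; next }
        # Tally each data row; skipped entirely if no status column exists.
        s { n[$s]++ }
        END { for (k in n) print k "\t" n[k] }
    ' "$1" | sort
}
```

Running `count_task_status execution/trace.txt` from the output directory gives one line per status. A run made up mostly of CACHED rows, with no makeQCreport entry, would match the situation described above.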

Do you mind updating the workflow to the latest version and trying again without resuming the process? I'd suggest dropping the workflow first, then rerunning it from a clean state (without -resume):

nextflow drop epi2me-labs/wf-somatic-variation
nextflow run epi2me-labs/wf-somatic-variation -r v1.2.2 ...

Let me know if the workflow runs the makeQCreport process.

Andrea

RenzoTale88 commented 4 months ago

@selmapichot did you try the mentioned suggestion?

selmapichot commented 4 months ago

Hi Andrea, yes I tried, the job was sitting in the queue for a long time and when it ran, there was an error message related to pulling the singularity image :/ I have just pulled the image manually and re-submitted the job. I will keep you updated when it finishes, hopefully. All the best, S

RenzoTale88 commented 4 months ago

@selmapichot If you are running it on a cluster, you can probably ask your IT support for help creating a Nextflow configuration file. This should speed up the workflow considerably. As all clusters are different, we cannot offer any support for configuring Nextflow on a specific cluster, but you can find help on how to create a configuration here. Alternatively, you can check out the nf-core configurations and see if they support one for your system, or contact the nf-core maintainers to help you create one.
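For orientation only, a cluster configuration file might look roughly like the one written by the helper below. Everything site-specific here is an assumption: the SLURM executor, the queue name `general`, the queue size and the Singularity cache path are placeholders to be replaced with values from your IT support. The function name `write_cluster_config` is hypothetical.

```shell
# Write a minimal Nextflow config for a cluster that uses Singularity.
# ASSUMPTIONS: the SLURM executor, the queue name and the cache path are
# placeholders; the real values depend on the cluster.
# Usage: write_cluster_config OUTPUT_FILE
write_cluster_config() {
    cat > "$1" <<'EOF'
// Hypothetical cluster settings; adjust to your scheduler.
process {
    executor = 'slurm'      // assumption: the cluster uses SLURM
    queue    = 'general'    // placeholder partition name
}
executor {
    queueSize = 20          // placeholder cap on jobs handled at once
}
singularity {
    enabled    = true
    autoMounts = true
    cacheDir   = '/path/to/singularity_cache'  // placeholder; lets pulled images be reused
}
EOF
}
```

After `write_cluster_config cluster.config`, the file can be passed to the run with Nextflow's `-c` option, for example `nextflow run epi2me-labs/wf-somatic-variation -c cluster.config ...`. Pointing `cacheDir` at a shared location may also help with the image-pulling error mentioned earlier, since an image pulled manually into that directory can then be reused.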

RenzoTale88 commented 3 months ago

@selmapichot any progress on this?

RenzoTale88 commented 3 months ago

@selmapichot closing this as there have been no updates in a long time. Please do get in touch again or reopen the ticket if you need further support with this.