getzlab / MutSig2CV

MutSig2CV from Lawrence et al. 2014

The params file format #4

Open RupalHatkar opened 3 years ago

RupalHatkar commented 3 years ago

Hello MutSig2CV Authors,

The instructions for the params file say that an example file is available at test/input/params.txt, but the repository has no test folder.

I created a params_file.txt containing remove_duplicate_patients = FALSE to instruct MutSig2CV not to remove duplicate patients, but it has no effect: MutSig2CV still removes all the patients, leaving me with no results files.

Can you please guide me? I am getting the following error:

    0 patients
    WARNING: MutSig is not applicable to single patients.
    Error using |
    Matrix dimensions must agree.
    Error in add_helper_is_fields (line 19)
    Error in impute_callschemes (line 12)
    Error in MutSig_2CV_v3_11_core (line 296)
    Error in MutSig_2CV_v3_11_wrapper (line 50)
    MATLAB:dimagree

Thank you! Rupal

julianhess commented 3 years ago

Hi Rupal,

Running MutSig on a cohort with a large mutation overlap between samples will yield very poor results. MutSig's background model explicitly assumes that mutations arise independently across samples, which is likely not the case when there is a substantial overlap. For example, multiple serial biopsies from the same patient will invariably share common ancestral events. In that case, violating the independence assumption will generate problematic results since passenger genes containing truncal mutations will appear to be recurrently mutated across many biopsies and show up as significant.
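To make the truncal-mutation problem concrete, here is a minimal sketch in Python. The cohort, patient IDs, biopsy IDs, and gene names are all fabricated for illustration, and the counting function is a hypothetical helper, not part of MutSig. It only shows how a single ancestral event in one patient looks recurrent when each biopsy is counted as an independent sample.

```python
from collections import defaultdict

# Fabricated toy cohort: (patient, biopsy, mutated gene).
# GENE_X is a truncal passenger mutation shared by all four biopsies of P1.
CALLS = [
    ("P1", "P1_b1", "GENE_X"),
    ("P1", "P1_b2", "GENE_X"),
    ("P1", "P1_b3", "GENE_X"),
    ("P1", "P1_b4", "GENE_X"),
    ("P2", "P2_b1", "GENE_Y"),
    ("P3", "P3_b1", "GENE_Y"),
]


def recurrence(calls, unit_index):
    """Count, per gene, how many distinct units (patients or biopsies) carry a mutation.

    unit_index selects the counting unit: 0 = patient, 1 = biopsy.
    """
    units = defaultdict(set)
    for call in calls:
        units[call[2]].add(call[unit_index])
    return {gene: len(members) for gene, members in units.items()}


per_biopsy = recurrence(CALLS, 1)   # every biopsy treated as an independent sample
per_patient = recurrence(CALLS, 0)  # one observation per patient

if __name__ == "__main__":
    print("per biopsy: ", per_biopsy)
    print("per patient:", per_patient)
```

Counted per biopsy, GENE_X appears in four "samples" and outranks GENE_Y. Counted per patient, it is a single event, while GENE_Y is mutated in two independent patients.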

Another common reason for substantial overlap between samples is germline contamination (common germline SNPs appear as recurrent somatic mutations), or poorly filtered/QC'd somatic variant calls that contain sequencing artifacts recurring across samples. These would also yield poor MutSig results.

I would carefully QC your mutation data before attempting to run with this option disabled.
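As one possible starting point for that QC, the sketch below (Python, standard library only) measures how much each pair of samples overlaps in a MAF. It is not MutSig's own duplicate detection, whose criteria are not described in this thread. The Jaccard index is simply an assumed, convenient overlap measure, the column names are the standard MAF headers, and the toy MAF content is fabricated.

```python
import csv
import io
from itertools import combinations

# Standard MAF column names; adjust if your file uses different headers.
SAMPLE_COLUMN = "Tumor_Sample_Barcode"
KEY_COLUMNS = ("Chromosome", "Start_Position", "Reference_Allele", "Tumor_Seq_Allele2")


def mutations_by_sample(maf_handle):
    """Map each sample barcode to the set of mutation keys it carries."""
    by_sample = {}
    # Skip '#' comment lines such as a leading '#version' header.
    lines = (line for line in maf_handle if not line.startswith("#"))
    for row in csv.DictReader(lines, delimiter="\t"):
        key = tuple(row[col] for col in KEY_COLUMNS)
        by_sample.setdefault(row[SAMPLE_COLUMN], set()).add(key)
    return by_sample


def pairwise_overlap(by_sample):
    """Return {(sample_a, sample_b): Jaccard index of the two mutation sets}."""
    overlap = {}
    for a, b in combinations(sorted(by_sample), 2):
        union = by_sample[a] | by_sample[b]
        shared = by_sample[a] & by_sample[b]
        overlap[(a, b)] = len(shared) / len(union) if union else 0.0
    return overlap


# Fabricated toy MAF: two sites from patient P1 share two of their mutations.
TOY_ROWS = [
    ("P1_siteA", "1", "100", "C", "T"),
    ("P1_siteA", "2", "200", "G", "A"),
    ("P1_siteA", "3", "300", "A", "G"),
    ("P1_siteB", "1", "100", "C", "T"),
    ("P1_siteB", "2", "200", "G", "A"),
    ("P1_siteB", "4", "400", "T", "C"),
    ("P2", "5", "500", "C", "A"),
]
TOY_MAF = "\n".join(
    "\t".join(fields) for fields in [(SAMPLE_COLUMN,) + KEY_COLUMNS] + TOY_ROWS
) + "\n"

if __name__ == "__main__":
    result = pairwise_overlap(mutations_by_sample(io.StringIO(TOY_MAF)))
    for pair, jaccard in sorted(result.items(), key=lambda item: -item[1]):
        print(pair, round(jaccard, 2))
```

For a real file, pass `open("cohort.maf")` (a hypothetical path) instead of the `io.StringIO` wrapper. Pairs with a high index are worth inspecting: they may be multiple biopsies of one patient, shared germline variants, or recurrent artifacts.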

—Julian

RupalHatkar commented 3 years ago

Hello Julian,

Yes, I do have multiple sites sequenced from the same patients. Besides QCing the mutation data, do you have any recommendations for handling mutations from multi-site sequencing? In your experience, is there any other tool for detecting driver and passenger mutations?

Thank you! Rupal

liux2250 commented 2 years ago

I ran into the same issue as Rupal. Could you kindly provide some sample MAF data so that beginners like us can run the package? That would be very helpful.

Best, Yang

dansteiert commented 1 year ago

It seems to me that the option is silently ignored in the file src/MutSig_2CV_v3_11_core.m. If you want to change this, I added an if clause as shown below and set the default for remove_duplicate_patients to true instead of 1.

    % remove duplicate patients (around line 196 of the modified file)
    % ADDED: the if clause, so deduplication only runs when the option is enabled
    if P.remove_duplicate_patients
      fprintf('Scanning for duplicate patients...\n');
      X = new_find_duplicate_samples(M.mut);
      if ~isempty(X.drop)
        fprintf('Removing the following %d duplicate patients:\n',length(X.drop));
        disp(X.drop);
        M.mut = reorder_struct_exclude(M.mut,ismember(M.mut.patient,X.drop));
      end
    end

Best!