brentp / somalier

fast sample-swap and relatedness checks on BAMs/CRAMs/VCFs/GVCFs... "like damn that is one smart wine guy"
MIT License

show warning when .somalier files have no sites with depth > 0 #42

Open taytayp opened 4 years ago

taytayp commented 4 years ago

I am trying to use somalier to confirm matches between tumor and normal samples from the same patient. somalier extract works fine for both .bam files using sites.hg19.vcf.

The trouble is with relate, whose parameters I can't seem to figure out. I have tried running it with some simple group and pedigree files (shown below), but the output .tsv files contain only empty rows.

This feels like it should be a simple use-case, but it is fairly befuddling. Any pointers?


cat group.txt
normal0,tumor0

cat pedigree.txt
fam normal0 0   0   0   0
fam tumor0   0  0   0   0

brentp commented 4 years ago

can you show the stdout and stderr when you run:

somalier relate -o test -g group.txt cohort/*somalier
head test.samples.tsv

taytayp commented 4 years ago

Sure thing.

/somalier # somalier relate -o test -g group.txt cohort/*somalier
somalier version: 0.2.6
[somalier] time to read files and get per-sample stats for 2 samples: 0.00
[somalier] time to get expected relatedness from pedigree graph: 0.00
[somalier] time to calculate all vs all relatedness for all 1 combinations: 0.00
[somalier] wrote interactive HTML output for 1 pairs to: test.html
[somalier] wrote groups to: test.groups.tsv
[somalier] wrote samples to: test.samples.tsv
[somalier] wrote pair-wise relatedness metrics to: test.pairs.tsv
/somalier # head test.samples.tsv 
#sample pedigree_sex    gt_depth_mean   gt_depth_sd depth_mean  depth_sd    ab_mean ab_std  n_hom_ref   n_het   n_hom_alt   n_unknown   p_middling_ab   X_depth_mean    X_n X_hom_ref   X_het   X_hom_alt   Y_depth_mean    Y_n
TP19-09N_N      0.0 -nan    0.0 0.0 0.00    -nan    0   0   0   17384   0.000   0.00    0   0   0   0   0.00    0
TP19-09T_T      0.0 -nan    0.0 0.0 0.00    -nan    0   0   0   17384   0.000   0.00    0   0   0   0   0.00    0

brentp commented 4 years ago

Looks like you don't have any data in the .somalier files. Can you re-extract the samples and show the output? Perhaps you extracted with GRCh37 sites and your data is in GRCh38, or vice versa?
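
One way to test the build-mismatch hypothesis is to compare the contig lengths declared in the two headers. Below is a minimal sketch in Python. It assumes the sites VCF carries `##contig=<ID=...,length=...>` header lines (not every VCF does) and that the BAM header text has been obtained separately, for example with `samtools view -H`. The function names are hypothetical and not part of somalier, and the two header strings at the bottom are illustrative inputs, not data from this thread.

```python
import re


def vcf_contig_lengths(header_text):
    """Map contig ID -> length from VCF '##contig=<ID=...,length=...>' lines."""
    lengths = {}
    for line in header_text.splitlines():
        if not line.startswith("##contig="):
            continue
        cid = re.search(r"ID=([^,>]+)", line)
        length = re.search(r"length=(\d+)", line)
        if cid and length:
            lengths[cid.group(1)] = int(length.group(1))
    return lengths


def sam_contig_lengths(header_text):
    """Map contig name -> length from SAM '@SQ' header lines (SN and LN tags)."""
    lengths = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after '@SQ' looks like 'TAG:value'.
        tags = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "SN" in tags and "LN" in tags:
            lengths[tags["SN"]] = int(tags["LN"])
    return lengths


def build_mismatches(vcf_lengths, sam_lengths):
    """Return VCF contigs that are missing from the BAM or differ in length."""
    problems = {}
    for contig, vcf_len in vcf_lengths.items():
        sam_len = sam_lengths.get(contig)  # None when the contig is absent
        if sam_len != vcf_len:
            problems[contig] = (vcf_len, sam_len)
    return problems


# Illustrative headers: chr1 length as in hg19 vs. as in GRCh38.
vcf_header = "##fileformat=VCFv4.2\n##contig=<ID=chr1,length=249250621>\n"
sam_header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:248956422\n"

mismatches = build_mismatches(
    vcf_contig_lengths(vcf_header), sam_contig_lengths(sam_header)
)
print(mismatches)
```

Any entry in the returned dictionary suggests that the sites file and the alignments may not share a reference build, which would be worth ruling out before re-running extract.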

taytayp commented 4 years ago

Ah, looks like that did it! Is it expected that the graphs are relatively unimpressive? There's just one comparison point between the two samples, I assume.

#sample_a   sample_b    relatedness hom_concordance hets_a  hets_b  shared_hets hom_alts_a  hom_alts_b  shared_hom_alts ibs0    ibs2    n   x_ibs0  x_ibs2  expected_relatedness
TP19-09N_N  TP19-09T_T  0.999   1.000   5616    6333    5609    4791    5405    4789    0   14229   14235   0   282 -1.0

Thanks for the help. Maybe a good error or warning would be one indicating that the *somalier inputs are empty? It might save some users like me a headache.
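
Until such a warning exists, a similar check can be approximated outside somalier by scanning the samples.tsv output. Below is a minimal sketch in Python. It assumes the tab-separated column names shown in the `head test.samples.tsv` output earlier in this thread. The function name `find_empty_samples` is hypothetical and not part of somalier, and the example text is a trimmed subset of the columns, kept short for illustration.

```python
import csv
import io


def find_empty_samples(tsv_text):
    """Return names of samples whose genotype counts are all zero.

    Assumes the n_hom_ref, n_het and n_hom_alt columns from the
    samples.tsv header shown in this thread.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    empty = []
    for row in reader:
        called = sum(int(row[c]) for c in ("n_hom_ref", "n_het", "n_hom_alt"))
        if called == 0:
            # The header's first column is literally named '#sample'.
            empty.append(row["#sample"])
    return empty


# Trimmed-down example modelled on the empty rows reported above.
example = (
    "#sample\tn_hom_ref\tn_het\tn_hom_alt\tn_unknown\n"
    "TP19-09N_N\t0\t0\t0\t17384\n"
    "TP19-09T_T\t0\t0\t0\t17384\n"
)

for name in find_empty_samples(example):
    print(f"warning: {name} has no genotyped sites; check the sites file and genome build")
```

For a real run, the text would come from reading `test.samples.tsv` instead of the inline example string.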