nanoporetech / ont-spectre

numpy complaints when using --min-cnv-len 10000 #3

Open fidibidi opened 1 month ago

fidibidi commented 1 month ago

I ran Spectre on a sample using this command:

# ont version code
  spectre CNVCaller \
  --bin-size 1000 \
  --threshhold-quantile 10 \
  --dist-proportion 0.3 \
  --coverage input-files/ \
  --sample-id A0035 \
  --output-dir A0035_ont_spectre_output/ \
  --reference ~/refs/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
  --blacklist grch38_blacklist_0.3 \
  --min-cnv-len 10000 \
  --snv input-files/A0035.wf_snp.vcf.gz \
  --metadata grch38_metadata

Spectre ran, but produced this worrisome message during its run.

spectre::INFO> Number positions to be tested on chromosome chrY: 2290
spectre::INFO> Number positions to be tested on chromosome chrM: 1
/home/ubuntu/miniconda3/envs/ont-spectre/lib/python3.8/site-packages/numpy/lib/nanfunctions.py:1872: RuntimeWarning: Degrees of freedom <= 0 for slice.
  var = nanvar(a, axis=axis, dtype=dtype, out=out, ddof=ddof,
/home/ubuntu/miniconda3/envs/ont-spectre/lib/python3.8/site-packages/numpy/lib/nanfunctions.py:1217: RuntimeWarning: All-NaN slice encountered
  return function_base._ureduce(a, func=_nanmedian, keepdims=keepdims,
/home/ubuntu/miniconda3/envs/ont-spectre/lib/python3.8/site-packages/spectre/analysis/analysis.py:278: RuntimeWarning: All-NaN slice encountered
  cov_stats.min = np.nanmin(self.coverage)
/home/ubuntu/miniconda3/envs/ont-spectre/lib/python3.8/site-packages/spectre/analysis/analysis.py:279: RuntimeWarning: All-NaN slice encountered
  cov_stats.max = np.nanmax(self.coverage)
/home/ubuntu/miniconda3/envs/ont-spectre/lib/python3.8/site-packages/spectre/util/dataAnalyzer.py:12: RuntimeWarning: All-NaN slice encountered
  min_val = np.nanmin(normalized_candidates)
/home/ubuntu/miniconda3/envs/ont-spectre/lib/python3.8/site-packages/spectre/util/dataAnalyzer.py:13: RuntimeWarning: All-NaN slice encountered
  max_val = np.nanmax(normalized_candidates)
/home/ubuntu/miniconda3/envs/ont-spectre/lib/python3.8/site-packages/spectre/util/dataAnalyzer.py:14: RuntimeWarning: Mean of empty slice
  avg = np.nanmean(normalized_candidates)
spectre::INFO> Number positions to be tested on chromosome chr1_KI270706v1_random: 8
spectre::INFO> Number positions to be tested on chromosome chr1_KI270707v1_random:

In the end, a VCF was generated with no CNVs found. However, when a similar command (modified with vanilla Spectre in mind) is run, the resulting VCF does contain CNVs.

spectre CNVCaller \
  --dist-proportion 0.3 \
  --coverage input-files/A0035.regions.bed.gz \
  --sample-id A0035 \
  --output-dir A0035_spectre_output/ \
  --reference ~/refs/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
  --blacklist data/grch38_blacklist_0.3.bed \
  --min-cnv-len 11000 \
  --snfj input-files/A0035.wf_sv.snfj.gz \
  --snv input-files/A0035.wf_snp.vcf.gz \
  --metadata data/grch38_metadata.mdr

oxygen311 commented 4 weeks ago

Hello @fidibidi,

Thank you for raising this issue. The warnings you're seeing are likely due to the small chromosome sizes, such as chrM and unplaced contigs, where the number of positions to be tested is very limited. These warnings generally do not significantly affect the results and can be safely ignored in this context.
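
For reference, the same numpy messages can be reproduced outside Spectre with a single all-NaN value, which is roughly what a contig with one bin and no usable coverage would look like. This is only a minimal sketch: `safe_nan_stats` is a hypothetical helper written for illustration and is not part of Spectre, and the one-element array is a stand-in, not real coverage data.

```python
import warnings

import numpy as np


def safe_nan_stats(values):
    """Hypothetical helper (not Spectre code): NaN-aware min/max/mean that
    returns NaN quietly when there is nothing usable to summarize."""
    arr = np.asarray(values, dtype=float)
    if arr.size == 0 or np.all(np.isnan(arr)):
        # No usable bins, so skip the reductions that would warn.
        return {"min": np.nan, "max": np.nan, "mean": np.nan}
    return {
        "min": float(np.nanmin(arr)),
        "max": float(np.nanmax(arr)),
        "mean": float(np.nanmean(arr)),
    }


# Stand-in for a tiny contig: one bin, no usable coverage value.
all_nan = np.array([np.nan])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    raw_min = np.nanmin(all_nan)           # warns: All-NaN slice encountered
    raw_mean = np.nanmean(all_nan)         # warns: Mean of empty slice
    raw_std = np.nanstd(all_nan, ddof=1)   # warns: Degrees of freedom <= 0 for slice.

messages = [str(w.message) for w in caught]
for message in messages:
    print(message)

# The guarded version returns the same NaN results without any warnings.
guarded = safe_nan_stats(all_nan)
print(guarded)
```

In each case numpy still returns NaN rather than failing, so the warnings only indicate that a contig had nothing to summarize.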

Regarding the use of --min-cnv-len set to 10kb, we do not recommend this configuration as it can lead to an increased number of false positives (FPs). For events smaller than 100kb, we suggest using Sniffles, which is more suited for handling such cases.
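
To check how many calls in an existing VCF fall below that 100 kb size, a rough sketch could look like the following. It assumes each record carries an `SVLEN` or `END` key in its INFO column (these are standard VCF structural-variant keys, but whether a given Spectre output populates them this way is an assumption), and `event_length` and `split_by_size` are hypothetical helper names. The two records are made-up examples, not real calls.

```python
def event_length(vcf_line):
    """Return the event length in bp for one VCF data line, or None if the
    INFO column has neither SVLEN nor END (assumed keys, see above)."""
    fields = vcf_line.rstrip("\n").split("\t")
    pos = int(fields[1])
    info = dict(
        item.split("=", 1) for item in fields[7].split(";") if "=" in item
    )
    if "SVLEN" in info:
        return abs(int(info["SVLEN"]))
    if "END" in info:
        # For symbolic alleles POS is the base before the event.
        return int(info["END"]) - pos
    return None


def split_by_size(vcf_lines, threshold=100_000):
    """Split data lines into 'large' (>= threshold) and 'small' events.
    Header lines and records of unknown length are skipped."""
    result = {"large": [], "small": []}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        length = event_length(line)
        if length is None:
            continue
        key = "large" if length >= threshold else "small"
        result[key].append(line)
    return result


# Made-up records for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "chr1\t100000\t.\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=350000;SVLEN=250000",
    "chr2\t5000\t.\tN\t<DUP>\t.\tPASS\tSVTYPE=DUP;END=20000",
]
groups = split_by_size(example)
print(len(groups["large"]), "large,", len(groups["small"]), "small")
```

Records that land in the `small` group are the ones for which a dedicated SV caller such as Sniffles is suggested above.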

Please let us know if you have any further questions or concerns.

Best Regards, Alexey