a-slide / pycoQC

pycoQC computes metrics and generates interactive QC plots from the sequencing summary report generated by Oxford Nanopore Technologies basecallers (Albacore/Guppy)
https://a-slide.github.io/pycoQC/
GNU General Public License v3.0

How can I get BAM alignment reports if I am not using the human genome? #113

Closed hrpelg closed 4 years ago

hrpelg commented 4 years ago

Hi,

I am running pycoQC with a BAM file to get a report of aligned reads, but I am not getting the alignment plots. I guess this is because I am not using the human genome. Am I right? Is there a way I could do this with pycoQC? Are you planning to implement this?

Cheers,

E

a-slide commented 4 years ago

Hi @hrpelg, pycoQC doesn't care whether it is the human genome or not. Could you run pycoQC in verbose mode (--verbose) and copy the output so I can understand what is going on?

Thanks Ad

hrpelg commented 4 years ago

Checking arguments values
General info
package_name: pycoQC
package_version: 2.3.1.6
timestamp: 2020-03-06 13:24:34.108545

Runtime options
quiet: False
verbose: True
json_outfile:
template_file:
config_file:
report_title:
html_outfile: /workspace/hrpelg/Red_Flesh_ON/Comparison_Canu_corrected_reads_albacore2_Guppy/test
sample: 100000
min_pass_qual: 7
min_barcode_percent: 0.1
filter_calibration: False
runid_list: []
bam_file: ['/workspace/hrpelg/Red_Flesh_ON/03.minimap2/Redflesh.bam']
barcode_file: []
summary_file: ['/workspace/hrpelg/Red_Flesh_ON/albacore2/Red_flesh_ON_run1_Cas9/all_summary_run1.txt']

Initialising parser
Sequencing summary files found: /workspace/hrpelg/Red_Flesh_ON/albacore2/Red_flesh_ON_run1_Cas9/all_summary_run1.txt
Barcode summary files found:
Bam files found: /workspace/hrpelg/Red_Flesh_ON/03.minimap2/Redflesh.bam
Parsing input files
Importing sequencing information from sequencing summary files
Verifying fields and discarding unused columns
1D Run type
Columns found: ['read_id', 'run_id', 'channel', 'start_time', 'sequence_length_template', 'mean_qscore_template', 'calibration_strand_genome_template']
7,056 reads found in initial file
Discarding lines containing NA values
0 reads discarded
Filtering out zero length reads
151 reads discarded
Sorting run IDs by decreasing throughput
Run-id order ['381a627fa445f6ab43dcbb0542d4cc06427d04b3']
Reordering runids
Processing reads with Run_ID 381a627fa445f6ab43dcbb0542d4cc06427d04b3 / time offset: 0
Reindexing dataframe by read_ids
6,905 Final valid reads
Parser stats
Summary files found: 1
Barcode files found: 0
Bam files found: 1
Initial reads: 7056
Reads with NA values discarded: 0
Zero length reads discarded: 151
Valid reads: 6905

Loading plotting interface
Plotter stats
Barcode: False
Alignment: False
Promethion: False
All reads: 6,905
All bases: 112,578,716
All median read length: 10,165.0
Pass reads: 6,082
Pass bases: 106,589,486
Pass median read length: 11,359.0

Parsing html config file
Read default configuration file

{'summary': {'plot_title': 'Run summary'},
'barcode_summary': {'plot_title': 'Run summary by barcode'},
'run_id_summary': {'plot_title': 'Run summary by Run ID'},
'reads_len_1D': {'plot_title': 'Distribution of read length', 'color': 'lightsteelblue', 'nbins': 200, 'smooth_sigma': 2},
'reads_qual_1D': {'plot_title': 'Distribution of read quality', 'color': 'salmon', 'nbins': 200, 'smooth_sigma': 2},
'reads_len_qual_2D': {'plot_title': 'Mean read quality per sequence length', 'colorscale': [[0.0, 'rgba(255,255,255,0)'], [0.1, 'rgba(255,150,0,0)'], [0.25, 'rgb(255,100,0)'], [0.5, 'rgb(200,0,0)'], [0.75, 'rgb(120,0,0)'], [1.0, 'rgb(70,0,0)']], 'len_nbins': 200, 'qual_nbins': 75, 'smooth_sigma': 2},
'output_over_time': {'plot_title': 'Output over experiment time', 'cumulative_color': 'rgb(204,226,255)', 'interval_color': 'rgb(102,168,255)'},
'len_over_time': {'plot_title': 'Read length over time', 'median_color': 'rgb(102,168,255)', 'quartile_color': 'rgb(153,197,255)', 'extreme_color': 'rgba(153,197,255,0.5)', 'smooth_sigma': 1},
'qual_over_time': {'plot_title': 'Mean read quality over time', 'median_color': 'rgb(250,128,114)', 'quartile_color': 'rgb(250,170,160)', 'extreme_color': 'rgba(250,170,160,0.5)', 'smooth_sigma': 1},
'barcode_counts': {'plot_title': 'Number of reads per barcode', 'colors': ['#f8bc9c', '#f6e9a1', '#f5f8f2', '#92d9f5', '#4f97ba']},
'channels_activity': {'plot_title': 'Channel activity over time', 'colorscale': [[0.0, 'rgba(255,255,255,0)'], [0.01, 'rgb(255,255,200)'], [0.25, 'rgb(255,200,0)'], [0.5, 'rgb(200,0,0)'], [0.75, 'rgb(120,0,0)'], [1.0, 'rgb(0,0,0)']], 'smooth_sigma': 1}}
Running method summary
summary ({'plot_title': 'Run summary'})
Plotting overall reads summary
Preparing data for all reads
Preparing data for pass reads
Running method barcode_summary
barcode_summary ({'plot_title': 'Run summary by barcode'})
No barcode information available
Running method run_id_summary
run_id_summary ({'plot_title': 'Run summary by Run ID'})
Plotting reads summary by run_id
Preparing data for all reads
Preparing data for pass reads
Running method reads_len_1D
reads_len_1D ({'plot_title': 'Distribution of read length', 'color': 'lightsteelblue', 'nbins': 200, 'smooth_sigma': 2})
Plotting read length distribution
Preparing data for all reads and num_bases
Preparing data for pass reads and num_bases
Running method reads_qual_1D
reads_qual_1D ({'plot_title': 'Distribution of read quality', 'color': 'salmon', 'nbins': 200, 'smooth_sigma': 2})
Plotting read quality distribution
Preparing data for all reads and mean_qscore
Preparing data for pass reads and mean_qscore
Running method reads_len_qual_2D
reads_len_qual_2D ({'plot_title': 'Mean read quality per sequence length', 'colorscale': [[0.0, 'rgba(255,255,255,0)'], [0.1, 'rgba(255,150,0,0)'], [0.25, 'rgb(255,100,0)'], [0.5, 'rgb(200,0,0)'], [0.75, 'rgb(120,0,0)'], [1.0, 'rgb(70,0,0)']], 'len_nbins': 200, 'qual_nbins': 75, 'smooth_sigma': 2})
Plotting read length vs read quality 2D distribution
Preparing data for all reads
Preparing data for pass reads
Running method output_over_time
output_over_time ({'plot_title': 'Output over experiment time', 'cumulative_color': 'rgb(204,226,255)', 'interval_color': 'rgb(102,168,255)'})
Plotting sequencing output over experiment time
Preparing data for all reads
Preparing data for pass reads
Preparing data for all bases
Preparing data for pass bases
Running method len_over_time
len_over_time ({'plot_title': 'Read length over time', 'median_color': 'rgb(102,168,255)', 'quartile_color': 'rgb(153,197,255)', 'extreme_color': 'rgba(153,197,255,0.5)', 'smooth_sigma': 1})
Plotting read length over experiment time
Preparing data for all reads and num_bases
Preparing data for pass reads and num_bases
Running method qual_over_time
qual_over_time ({'plot_title': 'Mean read quality over time', 'median_color': 'rgb(250,128,114)', 'quartile_color': 'rgb(250,170,160)', 'extreme_color': 'rgba(250,170,160,0.5)', 'smooth_sigma': 1})
Plotting read quality over experiment time
Preparing data for all reads and mean_qscore
Preparing data for pass reads and mean_qscore
Running method barcode_counts
barcode_counts ({'plot_title': 'Number of reads per barcode', 'colors': ['#f8bc9c', '#f6e9a1', '#f5f8f2', '#92d9f5', '#4f97ba']})
Plotting barcode distribution
No barcode information available
Running method channels_activity
channels_activity ({'plot_title': 'Channel activity over time', 'colorscale': [[0.0, 'rgba(255,255,255,0)'], [0.01, 'rgb(255,255,200)'], [0.25, 'rgb(255,200,0)'], [0.5, 'rgb(200,0,0)'], [0.75, 'rgb(120,0,0)'], [1.0, 'rgb(0,0,0)']], 'smooth_sigma': 1})
Plotting channel activity
Preparing data for all reads
Preparing data for pass reads
Preparing data for all bases
Preparing data for pass bases
Loading HTML template
Read default jinja template
Rendering plots in d3js
Writing to HTML file (/workspace/appscratch/miniconda/pycoQC)

a-slide commented 4 years ago

Hi @hrpelg

You are using an old version of pycoQC. Could you update to the latest one? It is now available on Bioconda as well.

I also have the impression that the Bam file is read, but that no matching reads are found.
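One way to test that impression outside pycoQC is to compare the read IDs in the sequencing summary with the read names in the alignment file. The sketch below is only an illustration and is not how pycoQC matches reads internally. It assumes the BAM file has first been converted to SAM text (for example with `samtools view -h`), that the summary is tab-separated with a `read_id` column as in the log above, and the function names and file paths are hypothetical.

```python
import csv


def summary_read_ids(summary_path):
    """Collect the read_id column from a tab-separated sequencing summary."""
    with open(summary_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row["read_id"] for row in reader}


def sam_read_ids(sam_path):
    """Collect query names of mapped records from a SAM text file."""
    ids = set()
    with open(sam_path) as handle:
        for line in handle:
            # Skip SAM header lines and blank lines.
            if line.startswith("@") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            flag = int(fields[1])
            # Bit 0x4 of the SAM flag marks an unmapped read.
            if flag & 4:
                continue
            ids.add(fields[0])
    return ids


def shared_read_ids(summary_path, sam_path):
    """Return read IDs present in both the summary and the alignments."""
    return summary_read_ids(summary_path) & sam_read_ids(sam_path)


# Hypothetical usage, with placeholder paths:
# shared = shared_read_ids("all_summary_run1.txt", "Redflesh.sam")
# print(len(shared), "reads found in both files")
```

If the intersection is empty, the read names in the alignment file probably do not match the `read_id` values in the summary, for instance because the reads were renamed by a correction or assembly step before mapping.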

Thanks

a-slide commented 4 years ago

Assumed fixed