I have individual fastq.gz files; each fastq.gz file corresponds to one barcode.
When I use the classic NanoPlot command, I get an individual folder (containing summary, plots, etc.) for each barcode. Is there a way to concatenate all these NanoStats.txt files into one large table where the columns represent my different barcodes and the rows represent the different metrics?
    def check_quality(self):
        # Create the output folder if it doesn't exist
        os.makedirs(self.output_folder, exist_ok=True)
        # Get a list of all FASTQ files in the input folder
        fastq_files = [f for f in os.listdir(self.input_folder) if f.endswith(('.fastq', '.fastq.gz'))]
        # Run NanoPlot for each FASTQ file
        for fastq_file in fastq_files:
            input_path = os.path.join(self.input_folder, fastq_file)
            # Output path construction: one folder per barcode, named after the file without its extensions
            output_path = os.path.join(self.output_folder, fastq_file.split('.')[0])
            # Pass the arguments as a list so paths containing spaces are handled safely
            nanoplot_cmd = ['NanoPlot', '--fastq', input_path, '-o', output_path, '--threads', str(self.num_threads)]
            subprocess.run(nanoplot_cmd)

if __name__ == "__main__":
    input_folder = '/data/fastq'
    output_folder = '/data/quality_plots'
    num_threads = 6  # User to specify the number of threads
    # Create the checker and run NanoPlot on every FASTQ file
    checker = NanoPlotQualityChecker(input_folder, output_folder, num_threads)
    checker.check_quality()
Hello,
This is the code I'm using right now:
import os
import subprocess

class NanoPlotQualityChecker:
    def __init__(self, input_folder, output_folder, num_threads):
        self.input_folder = input_folder
        self.output_folder = output_folder
        self.num_threads = num_threads
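For the merging step asked about above, one possible approach is sketched here. It is only a minimal sketch and rests on some assumptions. It assumes every barcode folder under the output folder contains a NanoStats.txt in the plain "Metric: value" layout, with section titles on lines that carry no value. It also assumes the folder name is a good column label. The function names `parse_nanostats` and `merge_nanostats` are made up for illustration and are not part of NanoPlot.

```python
import csv
from pathlib import Path


def parse_nanostats(text):
    """Parse 'Metric: value' lines into a dict keyed by (section, metric)."""
    stats = {}
    section = ""
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not key:
            continue  # blank line
        if value:
            stats[(section, key)] = value
        else:
            section = key  # assumption: a line with no value is a section title
    return stats


def merge_nanostats(output_folder, table_path):
    """Write one TSV with metrics as rows and barcode folders as columns."""
    columns = {}
    for stats_file in sorted(Path(output_folder).glob("*/NanoStats.txt")):
        columns[stats_file.parent.name] = parse_nanostats(stats_file.read_text())

    # Collect the metrics in first-seen order across all barcodes
    rows = []
    for stats in columns.values():
        for key in stats:
            if key not in rows:
                rows.append(key)

    with open(table_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["section", "metric", *columns])
        for section, metric in rows:
            values = [stats.get((section, metric), "") for stats in columns.values()]
            writer.writerow([section, metric, *values])
```

It could be called after `check_quality()` has finished, for example `merge_nanostats('/data/quality_plots', '/data/quality_plots/all_barcodes.tsv')`. Values are kept as the strings found in the files (including thousands separators), and a metric missing for a barcode is left as an empty cell. Keeping the section next to the metric stops the repeated "1:", "2:", ... entries of the different top-5 lists from overwriting each other.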