How do I run UNAGI with multiple samples?

iMetOsaka / UNAGI


How do I run UNAGI with multiple samples? #12

Open chaturvedi-lab opened 1 year ago

chaturvedi-lab commented 1 year ago

Hi there,

I have nanopore cDNA data from four samples and would like to run UNAGI on them. How do I do this? Should I (1) run UNAGI on each sample individually (and if so, how do I combine the results?), or (2) pool the reads from all four samples and run UNAGI once?

Thanks, Sam

Kyomari commented 1 year ago

To run UNAGI on your nanopore cDNA data from four samples, you can follow these general steps:

Preprocessing (adapter trimming and quality control): Use a tool like Porechop or Guppy to remove adapters and trim low-quality reads. You can also use tools like NanoPlot or FastQC to assess read quality before and after trimming.

Alignment: Use a tool like minimap2 to align the trimmed reads to the reference genome or transcriptome. This will generate a BAM file for each sample.

Transcript quantification: Run UNAGI on each sample separately using the aligned BAM file as input. UNAGI will quantify the expression of transcripts based on the reads that align to them.
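The alignment step above could be sketched roughly as follows for four samples. This is only an illustration: the sample names (sample1 to sample4), the file naming, and the DRY_RUN switch are assumptions, not part of UNAGI or of the original answer. The `-ax splice` preset is minimap2's setting for spliced long reads aligned to a genome; a transcriptome reference would call for a different preset such as `map-ont`.

```shell
#!/usr/bin/env bash
# Sketch: align each sample's reads with minimap2 and write one sorted BAM per sample.
# Sample names, file names, and the DRY_RUN switch are hypothetical.
set -eu

GENOME="${GENOME:-genome.fasta}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

align_sample() {
    sample="$1"
    # Spliced long-read alignment, SAM written to a file via -o
    run minimap2 -ax splice -o "${sample}.sam" "$GENOME" "${sample}.fastq"
    # Coordinate-sort into a BAM file, then index it
    run samtools sort -o "${sample}.sorted.bam" "${sample}.sam"
    run samtools index "${sample}.sorted.bam"
}

for sample in sample1 sample2 sample3 sample4; do
    align_sample "$sample"
done
```

With DRY_RUN left at 1 the script only prints the commands, which makes it easy to check the file names before running with DRY_RUN=0.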

I hope this helps.

chaturvedi-lab commented 1 year ago

Thank you for the quick reply! I just want to confirm one last thing. Do I run UNAGI on the BAM files as follows, the same way as with a FASTQ file?

unagi -i sample.sorted.bam -g genome.fasta -o outdir

Kyomari commented 1 year ago

Yes. Here, sample.sorted.bam is the aligned BAM file for your sample, genome.fasta is the reference genome or transcriptome in FASTA format, and outdir is the directory where UNAGI will output the results.

chaturvedi-lab commented 1 year ago

Hi there,

I ran UNAGI on the sorted BAM files and got the following error:

[2023/05/05 - 15:09:25] Reading the input file
[2023/05/05 - 15:09:25] The input file must be in the fastq or fastq.gz format.

Thanks, Sam

Kyomari commented 1 year ago

Based on the error message, UNAGI expects a fastq or fastq.gz file as input, but it was given a sorted BAM file. You'll need to convert the BAM file to FASTQ, which samtools can do. Here's an example command:

samtools fastq input.bam > output.fastq

Replace input.bam with the path to your sorted BAM file and output.fastq with the desired name and path for the resulting fastq file.
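For four samples, the conversion and the UNAGI run could be looped as in the sketch below. The sample names, the `unagi_<sample>` output directory naming, and the DRY_RUN switch are assumptions made for illustration; the `unagi -i ... -g ... -o ...` form is the one quoted earlier in this thread. If the original FASTQ files are still available, they can presumably be passed to `-i` directly, since the error message asks for fastq or fastq.gz input.

```shell
#!/usr/bin/env bash
# Sketch: convert each sample's BAM back to FASTQ, then run UNAGI once per sample.
# Sample names, output directory names, and the DRY_RUN switch are hypothetical.
set -eu

GENOME="${GENOME:-genome.fasta}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

process_sample() {
    sample="$1"
    if [ "$DRY_RUN" = "1" ]; then
        echo "samtools fastq ${sample}.sorted.bam > ${sample}.fastq"
        echo "unagi -i ${sample}.fastq -g ${GENOME} -o unagi_${sample}"
    else
        # BAM -> FASTQ, same command form as in the reply above
        samtools fastq "${sample}.sorted.bam" > "${sample}.fastq"
        # One output directory per sample so results are not overwritten
        unagi -i "${sample}.fastq" -g "$GENOME" -o "unagi_${sample}"
    fi
}

for sample in sample1 sample2 sample3 sample4; do
    process_sample "$sample"
done
```

Writing each sample to its own output directory keeps the per-sample results separate. How to merge them afterwards is not covered in this thread.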