input isoforms.fasta for chain_samples.py

Magdoll / SQANTI2

SQANTI2 is now replaced by SQANTI3. Please go to: https://github.com/ConesaLab/SQANTI3

input isoforms.fasta for chain_samples.py #52

Closed pcarbone closed 4 years ago

pcarbone commented 4 years ago

Hi Liz,

Sorry if I have misunderstood: which isoforms.fasta input is needed when using a multi-sample FL count file produced by chain_samples.py? I generated a multi-sample FL count file from 10 multiplexed tissues, and I have a collapsed fasta file for each demultiplexed sample. Should I somehow generate a merged fasta for all 10 samples to use as the input isoforms for SQANTI2, so that it can be analyzed together with the multi-sample FL count data?

Thank you. Pablo

Magdoll commented 4 years ago

Hi @pcarbone, you will need a unified ID for the multi-sample FL count file.

Since you have collapsed fasta/count files for each independent sample, you can chain them together using Cupcake.

Note that in the future, if you have multiplexed tissues from the same organism, another way to run the data is to process them pooled first (after removing cDNA primers, barcodes, and polyA tails), then use the demux script to get per-tissue counts afterwards. This approach generally yields slightly more isoforms, because the isoseq3 pipeline requires at least 2 FL reads (regardless of source tissue) to call an isoform. Low-abundance isoforms that are present as only 1 FL copy per tissue will therefore not be called when each tissue is analyzed separately, but will be recovered when the tissues are analyzed pooled.

-Liz
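To make the pooled versus per-tissue point concrete, here is a minimal sketch of the counting logic only. It is not isoseq3 code: the read list, tissue names, and isoform names are hypothetical, and the threshold is simply the 2 FL reads mentioned above, applied as "at least 2".

```python
from collections import Counter

# Threshold taken from the comment above; treating it as ">= 2" is an assumption.
MIN_FL_READS = 2

# Hypothetical FL reads as (tissue, isoform) pairs, for illustration only.
fl_reads = [
    ("liver", "isoA"), ("liver", "isoA"),
    ("liver", "isoB"),
    ("brain", "isoB"),
    ("brain", "isoC"),
]


def called_isoforms(reads, min_fl=MIN_FL_READS):
    """Return isoforms supported by at least min_fl reads, ignoring tissue."""
    counts = Counter(isoform for _tissue, isoform in reads)
    return {isoform for isoform, n in counts.items() if n >= min_fl}


def called_per_tissue(reads, min_fl=MIN_FL_READS):
    """Apply the threshold within each tissue, then take the union of calls."""
    by_tissue = {}
    for tissue, isoform in reads:
        by_tissue.setdefault(tissue, []).append((tissue, isoform))
    called = set()
    for tissue_reads in by_tissue.values():
        called |= called_isoforms(tissue_reads, min_fl)
    return called


pooled = called_isoforms(fl_reads)
per_tissue = called_per_tissue(fl_reads)
print("pooled:", sorted(pooled))
print("per tissue:", sorted(per_tissue))
```

In this toy data, isoB has one FL read in each tissue, so it passes the threshold only when the tissues are pooled.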

pcarbone commented 4 years ago

Dear Liz,

Thank you for your rapid reply and your useful scripts.

I already chained the FL counts using cDNA_Cupcake/build/scripts-3.7/chain_samples.py. However, I think the problem is that chain_samples.py produced only a chained count file and no chained fasta, because I did not include a FASTQ_FILENAME=optional.rep.fastq entry in the config. All of the demultiplexed-sample "isoseq3 collapse" runs ended with an error and wrote only fasta files, not fastq, which is why I cannot provide fastq input to chain_samples.py.

Do you think the most convenient solution at this point would be to process the pooled data and run the demux script afterwards, as you suggested?

Thanks again, Pablo

Magdoll commented 4 years ago

Hi @pcarbone, you can convert fasta to fastq using fa2fq.py in Cupcake (see the Cupcake tutorial).

-Liz
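For readers unsure what a fasta-to-fastq conversion involves, the sketch below shows the general idea: each sequence is written out as a four-line FASTQ record with a placeholder quality string. This is not the fa2fq.py source, and it does not claim to match its output. The function names, the placeholder quality character "I", and the demo record IDs are all assumptions made for illustration.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def fasta_to_fastq(fasta_handle, fastq_handle, qual_char="I"):
    """Write each FASTA record as FASTQ with a constant placeholder quality.

    qual_char is an assumed dummy value; FASTA carries no real base qualities.
    Returns the number of records written.
    """
    n = 0
    for header, seq in read_fasta(fasta_handle):
        fastq_handle.write(f"@{header}\n{seq}\n+\n{qual_char * len(seq)}\n")
        n += 1
    return n


# Hypothetical two-record input, with the first sequence wrapped over two lines.
demo_in = io.StringIO(">PB.1.1|hypothetical\nACGT\nAC\n>PB.2.1\nGG\n")
demo_out = io.StringIO()
n_records = fasta_to_fastq(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

For real data, the Cupcake fa2fq.py script mentioned above is the tool to use; the sketch only illustrates why a fasta can stand in for the missing fastq once dummy qualities are attached.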

pcarbone commented 4 years ago

Hi Liz, the fa2fq.py solution worked, and so did the downstream SQANTI2 run on the chained samples! Thank you! Pablo