LuyiTian / FLAMES

Full-length transcriptome splicing and mutation analysis
GNU General Public License v3.0

Isoform parameters #29

Closed jchang97 closed 2 years ago

jchang97 commented 2 years ago

Hi there,

Thank you so much for this amazing tool!

I am just wondering if it is possible to get a more in-depth explanation of each parameter for the config file e.g. for isoform parameters?

Thank you

ChangqingW commented 2 years ago

Here is a list of parameter descriptions from the FLAMES R package:

do_genome_align - Boolean. Specifies whether to run the genome alignment step. TRUE is recommended.
do_isoform_id - Boolean. Specifies whether to run the isoform identification step. TRUE is recommended.
do_read_realign - Boolean. Specifies whether to run the read realignment step. TRUE is recommended.
do_transcript_quanti - Boolean. Specifies whether to run the transcript quantification step. TRUE is recommended.
gen_raw_isoform - Boolean.
has_UMI - Boolean. Specifies whether the data contains UMIs.
max_dist - Maximum distance allowed when merging splice sites in isoform consensus clustering.
max_ts_dist - Maximum distance allowed when merging transcript start/end positions in isoform consensus clustering.
max_splice_match_dist - Maximum distance allowed when merging a splice site called from the data with one in the reference annotation.
min_fl_exon_len - Minimum length for the first exon outside the gene body in the reference annotation. This corrects an alignment artifact.
max_site_per_splice - Maximum number of transcript start/end site combinations allowed per splice chain.
min_sup_cnt - Minimum number of reads supporting an isoform. Decreasing this number will significantly increase the number of isoforms detected.
min_cnt_pct - Minimum percentage of counts for an isoform relative to the total count for the same gene.
min_sup_pct - Minimum percentage of counts for a splice chain that support a given transcript start/end site combination.
strand_specific - 0, 1 or -1. 1 indicates reads are on the same strand as the mRNA, -1 indicates reads are reverse complemented, 0 indicates reads are not strand specific.
remove_incomp_reads - The strength of truncated isoform filtering. A larger number means more stringent filtering.
use_junctions - Whether to use known splice junctions to help correct the alignment results.
no_flank - Boolean. For synthetic spike-in data. Refer to the Minimap2 documentation for details.
use_annotation - Boolean. Whether to use the reference to help annotate known isoforms.
min_tr_coverage - Minimum percentage of isoform coverage for a read to be aligned to that isoform.
min_read_coverage - Minimum percentage of read coverage for a read to be uniquely aligned to that isoform.
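
To make the interplay of min_sup_cnt and min_cnt_pct a bit more concrete, here is a rough Python sketch of how such thresholds could be applied to per-isoform read counts. This is not the FLAMES implementation: the function name `filter_isoforms`, the toy counts, the choice to treat min_cnt_pct as a fraction between 0 and 1, and the choice to require both thresholds at once are all assumptions made for illustration only.

```python
from collections import defaultdict


def filter_isoforms(counts, min_sup_cnt=10, min_cnt_pct=0.01):
    """Keep isoforms that pass both thresholds (illustrative only, not FLAMES code).

    counts maps (gene_id, isoform_id) -> number of supporting reads.
    min_cnt_pct is treated here as a fraction of the gene's total count
    (an assumption of this sketch).
    """
    # Total read count per gene, used as the denominator for min_cnt_pct.
    gene_totals = defaultdict(int)
    for (gene, _isoform), n in counts.items():
        gene_totals[gene] += n

    kept = {}
    for (gene, isoform), n in counts.items():
        total = gene_totals[gene]
        enough_reads = n >= min_sup_cnt
        enough_share = total > 0 and n / total >= min_cnt_pct
        if enough_reads and enough_share:
            kept[(gene, isoform)] = n
    return kept


# Hypothetical toy counts, not real data.
toy_counts = {
    ("geneA", "iso1"): 990,
    ("geneA", "iso2"): 12,
    ("geneA", "iso3"): 4,
    ("geneB", "iso1"): 15,
}

strict = filter_isoforms(toy_counts, min_sup_cnt=10, min_cnt_pct=0.05)
lenient = filter_isoforms(toy_counts, min_sup_cnt=3, min_cnt_pct=0.001)
print(sorted(strict))
print(sorted(lenient))
```

With the stricter settings only the dominant isoform of each gene survives in this toy example, while the lenient settings keep all four, which matches the note above that lowering min_sup_cnt increases the number of isoforms detected.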

jchang97 commented 2 years ago

Oh amazing thank you very much!