Missing default value for fusion detection

wangyiqing50 commented 4 years ago

I noticed that fusion detection reads the following parameters, but they are not set in the config file:

FUSION_MAX_ERROR_RATE = config["fusion_max_error_rate"]
FUSION_MIN_SCORE_DIFFERENCE = config["fusion_min_score_difference"]

lane-zhao commented 4 years ago

Hi wangyiqing, after adding these values manually, can you run the pipeline correctly? Did you also get many warnings like "WARNING: The graph has an edge between non-existant node(s)", and were you unable to get the GAM format file?

wangyiqing50 commented 4 years ago

No, it doesn't work. The error message is below:

Building DAG of jobs...
MissingInputException in line 41 of /data/user/yiqing19/aeron/Aeron/Snakefile:
Missing input files for rule align:
input/S

It says some input files are missing. I guess I misconfigured some files in the input folder, but I have no idea what 'input/S' should be.

wangyiqing50 commented 4 years ago

By the way, what are your recommended values for those parameters if I use CCS reads?

ddurai commented 4 years ago

It seems that there are some special characters in the input file. Can you please share your config file or copy-paste it here?

wangyiqing50 commented 4 years ago

# use the full file name, including file ending

# input splice graph
# Should be in the input folder
# format must be .vg
graph: hg19.gfa

# reference transcripts
# format can be either fasta/fastq, gzipped or not
# Should be in the input folder
transcripts: gencode.v33lift37.transcripts.fa

# sequenced reads
# Should be in the input folder
# format can be either fasta/fastq, gzipped or not
# for more files, add them in new lines starting with "- "
# NOTE: the file names without ending must be unique! You cannot have eg. reads.fq and reads.fa
reads: SRR7346977.fa

#optional params below: default values will probably work

#size of the seed hits. Fewer means more accurate but slower alignments.
seedsize: 17
#max number of seeds. Fewer means faster but more inaccurate alignment
maxseeds: 20
fusion_max_error_rate: 0.8
fusion_min_score_difference: 10
alignment_selection: --greedy-length
#Do not change
alignment_E_cutoff: 1

#bandwidth for the aligner. Higher means more accurate but slower alignment.
aligner_bandwidth: 35

gtffile: gencode.v33lift37.annotation.gtf

#file paths

# https://bitbucket.org/dilipdurai/aeron/
scripts: Aeronscript/
# https://github.com/maickrau/GraphAligner
binaries: Binaries/
# needed to convert mummer seeds to .gam seeds
vgpath: /home/yiqing19/vg
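
One possible reading of the 'input/S' error, offered as an assumption and not a confirmed diagnosis: the config comments say additional read files go on new lines starting with "- ", which suggests the workflow expects reads to be a list. If the Snakefile loops over config["reads"] and the value is a plain string, Python iterates over its characters, and the first character of SRR7346977.fa is S. The sketch below only illustrates that Python behaviour; input_paths is a hypothetical helper, not code from the Aeron Snakefile.

```python
# Assumption: the workflow builds input paths by looping over config["reads"].
# This illustrates Python semantics only; it is not the actual Aeron Snakefile.
def input_paths(reads):
    return ["input/" + name for name in reads]

as_string = "SRR7346977.fa"    # what `reads: SRR7346977.fa` parses to in YAML
as_list = ["SRR7346977.fa"]    # what a `- SRR7346977.fa` list entry parses to

print(input_paths(as_string)[0])  # input/S
print(input_paths(as_list)[0])    # input/SRR7346977.fa
```

If this is what is happening, putting the read file on its own line starting with "- " under reads: would be worth trying.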

maickrau commented 4 years ago

Hi wangyiqing, our config file was missing these two parameters. We have added them to the file. The recommended defaults are:

fusion_max_error_rate: 0.8
fusion_min_score_difference: 200

Let us know if it works
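
For older checkouts where the config file still lacks these keys, a Snakefile could also fall back to the values quoted above instead of failing with a KeyError. This is a minimal sketch, assuming config behaves like a plain Python dict (as it does in Snakemake); fusion_params and FUSION_DEFAULTS are hypothetical names and not part of Aeron.

```python
# Hypothetical helper, not part of Aeron: read the fusion parameters from a
# Snakemake-style config dict, falling back to the defaults quoted above.
FUSION_DEFAULTS = {
    "fusion_max_error_rate": 0.8,
    "fusion_min_score_difference": 200,
}

def fusion_params(config):
    """Return (max_error_rate, min_score_difference), using defaults for missing keys."""
    return tuple(config.get(key, default) for key, default in FUSION_DEFAULTS.items())

# A config that omits both keys no longer raises KeyError.
print(fusion_params({}))                                   # (0.8, 200)
# Explicit values in the config still take precedence.
print(fusion_params({"fusion_min_score_difference": 10}))  # (0.8, 10)
```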

wangyiqing50 commented 4 years ago

SchulzLab / Aeron

Missing default value for fusion detection #2