Open sandaruwanrat opened 2 years ago
Hello,
Is cluster_1329_ovlp_mapping_test_fwd.paf empty? And what is TAIR10_chr_all.fasta? Not sure I understand what you are trying to achieve.
Best regards, Robert
Hello Robert,
1) Neither the .paf nor the .sam file is empty.
2) TAIR10_chr_all.fasta is the Arabidopsis genome file. It contains the five chromosomes plus the mitochondrial genome (chrM). The length of each sequence is:
chr1 30427671
chr2 19698289
chr3 23459830
chr4 18585056
chr5 26975502
chrM 367808
My aim is to collapse the sequences in each cluster file (e.g. cluster_9.fasta) and get a consensus sequence. All sequences in a given "cluster_XXX.fasta" file should belong to the same genomic region. I would like to know if I am doing something wrong.
Thank you.
Best Regards Sandaruwan
How big are the genomic regions of each cluster?
It varies. The mean lengths of some clusters are 250 bp, 950 bp and 1.1 kb; overall, I have clusters ranging from 250 bp to 2.5 kb.
And how did you obtain the clusters? You might try https://github.com/rvaser/spoa instead of Racon.
I obtained the clusters based on UMIs, using https://github.com/fhlab/UMIC-seq. However, https://github.com/SorenKarst/longread_umi/blob/master/scripts/consensus_racon.sh uses racon to get consensus sequences from clusters.
I will try spoa instead of racon.
Thank you very much.
Best regards Sandaruwan
Hello,
My question has two parts.
1. When I run the command
racon -m 8 -x -6 -g -8 -w 500 cluster_1329.fasta cluster_1329_ovlp_mapping_test_fwd.paf TAIR10_chr_all.fasta > cluster_1329_tmp_consensus_test3.fasta
I get the following error:
[racon::Polisher::initialize] loaded target sequences
[racon::Polisher::initialize] loaded sequences
[racon::Polisher::initialize] error: empty overlap set!
However, I only get this for some of the cluster files; the others work fine.
Below are my minimap2 commands. I have used both the PAF and SAM formats.
minimap2 -x map-ont -t 1 -uf TAIR10_chr_all.fasta cluster_1329.fasta > cluster_1329_ovlp_mapping_test_fwd.paf
minimap2 -ax map-ont -t 1 -uf TAIR10_chr_all.fasta cluster_1329.fasta > cluster_1329_ovlp_mapping_test_fwd.sam
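A quick way to see what ended up in a PAF file before passing it to racon is to count its records per target sequence. The sketch below is only an illustration: the function name `summarize_paf` is hypothetical, and it relies solely on the PAF convention that a record has at least 12 tab-separated columns with the target name in column 6. A non-zero count does not guarantee that racon will accept the overlaps, since racon may still discard some of them with its own filters.

```python
from collections import Counter


def summarize_paf(paf_path):
    """Count PAF records per target sequence name.

    Hypothetical helper: lines with fewer than 12 tab-separated
    columns (for example blank lines) are skipped.
    """
    targets = Counter()
    with open(paf_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            # A PAF record has 12 mandatory columns.
            if len(fields) < 12:
                continue
            # Column 6 (index 5) holds the target sequence name.
            targets[fields[5]] += 1
    return targets
```

Calling `summarize_paf("cluster_1329_ovlp_mapping_test_fwd.paf")` and comparing the result with that of a cluster that works would show whether the failing cluster has any records at all and which chromosomes they hit.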
2. I get multiple consensus sequences
As I mentioned in part one, racon does generate consensus sequences for some cluster files with the same command, for example:
racon -m 8 -x -6 -g -8 -w 500 cluster_9.fasta cluster_9_ovlp_mapping_test_fwd.paf TAIR10_chr_all.fasta > cluster_9_tmp_consensus_test3.fasta
But the problem is that there is more than one sequence in the output (example below):
>chr1 LN:i:30427560 RC:i:155 XC:f:0.000049
Sequence
I would greatly appreciate your feedback on this. Thank you very much.
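To see how many sequences a racon output file contains and what their header tags say, the headers can be listed with a few lines of Python. This is a minimal sketch: the function name `read_fasta_headers` is hypothetical, and it only assumes headers of the form shown above, a name followed by `KEY:type:value` tokens. My reading of the tags (LN as the sequence length, RC as the number of reads used, XC as the fraction of the target that was polished) is an assumption and should be checked against the racon documentation. Because the target file in the commands above is the whole genome, one output record per chromosome that received overlaps would be a plausible explanation for the multiple sequences, but that is a guess as well.

```python
def read_fasta_headers(fasta_path):
    """Return a list of (name, tags) tuples, one per '>' header line.

    Hypothetical helper: tags are parsed from 'KEY:type:value' tokens
    and stored as strings, e.g. {'LN': '30427560'}.
    """
    records = []
    with open(fasta_path) as handle:
        for line in handle:
            if not line.startswith(">"):
                continue  # sequence lines are ignored
            parts = line[1:].split()
            if not parts:
                continue  # a bare '>' with no name
            tags = {}
            for token in parts[1:]:
                pieces = token.split(":", 2)
                if len(pieces) == 3:
                    key, _value_type, value = pieces
                    tags[key] = value
            records.append((parts[0], tags))
    return records
```

For example, `len(read_fasta_headers("cluster_9_tmp_consensus_test3.fasta"))` gives the number of sequences in the output, and the `LN` value of each record shows whether a sequence is chromosome-sized or cluster-sized.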