Closed chaoyang666 closed 1 year ago
The most important question for this is: did you construct, index and align an N-masked genome (e.g. with SNPsplit_genome_preparation
) before running the bisulfite alignments. From looking at the MD:Z:
string I can't see any mismatches to N
s, that's why I'm asking.
A bit further up in the SNPsplit report you should also see the stat about how many reads contained mismatches to Ns. If that is 0 it would also tell you whether that was the reason...
Dear Felix, Thank you very much. Could you please help me; Why I get the error information like as:
Reading/filtering VCF file: No (skipped by user)
Reference genome: 01-maize/split/
N-masking: Yes
Full SNP genome: Yes
SNP strain: maize
Using the following chromosomes (HARCODED IN!!!):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y MT
Skipped reading the VCF file and filtering SNPs again (specified by user)
Now reading in and storing sequence information of the genome specified in: 01-tools/03-genome/01-maize/split/
chr Chr10 (150982314 bp)
chr Chr1 (307041717 bp)
chr Chr2 (244442276 bp)
chr Chr3 (235667834 bp)
...............
...............
[FATAL ERROR] Please ensure that the same version of the genome is used for both VCF annotations and reference genome (FastA files). Exiting...
I run like this:
SNPsplit_genome_preparation --skip_filtering --ref 01-maize/split --strain maize --full_sequence
I have the following files in the folder: SNPs_maize chr10.txt chr1.txt chr2.txt chr3.txt chr4.txt chr5.txt chr6.txt chr7.txt chr8.txt chr9.txt
chr1.txt >Chr1 796078 Chr1 26291 1 T/C 796079 Chr1 26361 1 C/A 796080 Chr1 26421 1 C/T 796081 Chr1 26442 1 C/T 796082 Chr1 26605 1 C/T 796083 Chr1 26650 1 A/G
my reference genome is: >Chr10 CCTAAACCCTAAACCCTAAAC
Thank you very much! Chao
Dear Felix,
So far, I have solved this problem by modifying the code of SNPsplit-0.3.4/SNPsplit_genome_preparation `
@chroms = ('Chr1','Chr2','Chr3','Chr4','Chr5','Chr6','Chr7','Chr8','Chr9','Chr10');
` Thanks a lot, Chao
Hi:
I have a problem. Could you please help me with this? I can't split my bam file with SNPsplit and can't find the reasons.
My run code is as below: SNPsplit --bisulfite --snp_file B73_V4.snp.SNP my.bam
input file1 B73_V4.snp.SNP 1 Chr10 92742 1 G/C 2 Chr10 92778 1 A/C 3 Chr10 92928 1 T/A 4 Chr10 92939 1 A/T 5 Chr10 92942 1 A/G 6 Chr10 92975 1 T/G 7 Chr10 92986 1 T/A 8 Chr10 93130 1 A/T 9 Chr10 93213 1 T/C 10 Chr10 93224 1 C/T input file2 my.bam
££££££££££££££££
OUTPUT: Allele-tagging report
Processed 158277819 read alignments in total Reads were unaligned and hence skipped: 0 (0.00%) 158277819 reads were unassignable (100.00%) 0 reads were specific for genome 1 (0.00%) 0 reads were specific for genome 2 (0.00%) 0 reads that were unassignable contained C>T SNPs preventing the assignment 1528 reads did not contain one of the expected bases at known SNP positions (0.00%) 0 contained conflicting allele-specific SNPs (0.00%)
Many thanks, Chao 13/07/2022