bmansfeld / QTLseqr

QTLseqr is an R package for QTL mapping using NGS Bulk Segregant Analysis

a new question #31

Closed liuxiaowei0401 closed 4 years ago

liuxiaowei0401 commented 4 years ago

I have calculated the SNP-index of bulk_1 and bulk_2 and used Excel to plot it, but I could not plot the 95% and 99% confidence intervals (CI). I sincerely hope you can give me a script to solve my problem. The data is as follows:

[image: data table]

And I have plotted the picture as follows:

[image: Rplot01-1-1]

But I need a plot that contains the 95% CI and 99% CI lines, so I want to know: can I use the data I provided above to calculate the 95% and 99% CI?

bmansfeld commented 4 years ago

Hi again, QTLseqr has an option to import data from a csv file. The function is called importFromTable(). The use of the function is described in the vignette. Briefly, it accepts data in a pretty straightforward table, which you should be able to generate if you have the above data. From the vignette:

Your file must include some necessary columns: CHROM (chromosome names) and POS (the SNP position), as well as the reference and alternate allele depths (number of reads supporting each allele). The allele depths should be in columns named in this format: AD_<ALT/REF>.<sampleName>. For example, the column for alternate allele depth for a high bulk sample named “sample1” should be “AD_ALT.sample1”. Any other columns describing the SNPs are allowed, e.g. the actual allele calls or a quality score. If the column is bulk-specific, it should be named columnName.sampleName, e.g. “QUAL.sample1”.
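As a rough illustration of the layout described in that excerpt, here is a minimal Python sketch that writes a table with the required columns. The sample names ("sample1", "sample2"), the helper names, and all values are made up for illustration; the sketch only produces the CSV text and does not call QTLseqr.

```python
import csv
import io


def required_columns(sample_names):
    """Build the column names from the excerpt: CHROM, POS, then
    AD_REF.<sampleName> and AD_ALT.<sampleName> for each sample."""
    cols = ["CHROM", "POS"]
    for name in sample_names:
        cols += [f"AD_REF.{name}", f"AD_ALT.{name}"]
    return cols


def to_csv(rows, sample_names):
    """Serialize rows (dicts keyed by column name) as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=required_columns(sample_names), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical allele depths for two SNPs in two bulks (not real data).
rows = [
    {"CHROM": "Chr1", "POS": 1050,
     "AD_REF.sample1": 12, "AD_ALT.sample1": 18,
     "AD_REF.sample2": 20, "AD_ALT.sample2": 9},
    {"CHROM": "Chr1", "POS": 2310,
     "AD_REF.sample1": 7, "AD_ALT.sample1": 21,
     "AD_REF.sample2": 15, "AD_ALT.sample2": 14},
]

csv_text = to_csv(rows, ["sample1", "sample2"])
print(csv_text)
```

Any extra per-SNP columns (for example a quality score per bulk) would be added the same way, following the columnName.sampleName pattern.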

This brings me to the data frame you posted. It seems that you are calculating the snp-index using the total depths of the samples, i.e.:

snp-index = Bulk1-Depth / (Bulk1-Depth + Bulk2-Depth)

This is not correct and, as far as I can see, doesn't include alternate and reference allele depths for each bulk. You need the read count for each allele to perform this analysis. This is also why your figure seems to plot the "snp-index" and not the delta-snp-index, which is the statistic that gets a confidence interval.
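To make the difference concrete, here is a minimal Python sketch of the per-bulk calculation, assuming the usual definition in which the SNP-index of a bulk is its alternate allele depth divided by its total depth at that SNP, and the delta SNP-index is the difference between the two bulks. The function names, the high-minus-low order of subtraction, and the read counts are assumptions for illustration; this is not QTLseqr's own code, and it does not compute confidence intervals.

```python
def snp_index(ad_ref, ad_alt):
    """SNP-index of one bulk: alternate reads over all reads at the SNP."""
    depth = ad_ref + ad_alt
    if depth == 0:
        raise ValueError("no reads cover this SNP")
    return ad_alt / depth


def delta_snp_index(high_bulk, low_bulk):
    """Difference in SNP-index between two bulks.

    Each argument is an (ad_ref, ad_alt) pair. Subtracting low from
    high is an assumed convention for this sketch.
    """
    return snp_index(*high_bulk) - snp_index(*low_bulk)


# Hypothetical read counts at one SNP (not real data).
high_bulk = (5, 15)   # 5 reference reads, 15 alternate reads
low_bulk = (12, 8)    # 12 reference reads, 8 alternate reads

print(snp_index(*high_bulk))
print(snp_index(*low_bulk))
print(round(delta_snp_index(high_bulk, low_bulk), 3))
```

Note that each bulk's index uses only that bulk's own reference and alternate counts; the total depth of the other bulk never enters the calculation.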

Instead of me writing a script for you to perform the analysis, I recommend you use the above approach and import the correct data format into the software. It should then take you only a short time to run the scripts and get the result you want.

Though, based on your previous question (#30), it seems you aren't using two phenotypic bulks but rather comparing one bulk to a reference line, similar to a mapping-by-sequencing approach (MutMap or SHORE mapping). If that is the case, our software doesn't currently offer support for this kind of bulk segregant analysis, though the approaches are similar.

Let me know if you have any questions. And again, please read the entire vignette. Sorry to be annoying about this, but I really spent a lot of time there explaining things so that I wouldn't need to do it here.

Good luck, Ben

liuxiaowei0401 commented 4 years ago

Thank you so much. Indeed, my question took up a lot of your time. I think I now understand the answer to my question. Thank you, my friend. You must be a good "teacher".

Good luck, Liu