SalvatoreRa closed this issue 6 years ago
Hi Salvo,
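The higher the absolute score in the siteset, the better the match, so the most negative scores are the weakest matches. The relative score is calculated as the absolute score divided by the maximal score of the profile. The min.score argument sets the minimal relative score a hit must reach to be reported.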
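To make the relation between absolute and relative scores concrete, here is a minimal base-R sketch. The PWM values and the helper `abs_score` are made up for illustration and are not part of TFBSTools. The first ratio follows the definition given above; the second, which also subtracts the minimal score, is shown only as an assumed alternative normalisation, so `relScore()` in your installed version remains the reference.

```r
# Toy log-odds PWM (rows = bases, columns = positions); values are invented.
pwm <- matrix(c( 1.0, -2.0,  0.5,
                -1.0,  1.5, -0.5,
                -2.0, -1.0,  1.2,
                 0.5,  0.2, -2.0),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("A", "C", "G", "T"), NULL))

# Hypothetical helper: absolute score = sum of the matrix entries
# selected by the base found at each position of the site.
abs_score <- function(pwm, site) {
  bases <- strsplit(site, "")[[1]]
  sum(pwm[cbind(match(bases, rownames(pwm)), seq_along(bases))])
}

max_score <- sum(apply(pwm, 2, max))  # score of the best possible site
min_score <- sum(apply(pwm, 2, min))  # score of the worst possible site

sites <- c(best = "ACG", middling = "TTA", worst = "GAT")
abs_scores <- sapply(sites, abs_score, pwm = pwm)

# Relative score as described in the reply: absolute / maximal.
rel_scores <- abs_scores / max_score

# Assumed alternative: rescale between minimal and maximal score,
# which keeps every value within [0, 1].
rel_minmax <- (abs_scores - min_score) / (max_score - min_score)

print(data.frame(site = sites, abs = abs_scores,
                 rel = rel_scores, rel_minmax = rel_minmax))
```

In this toy case the consensus site reaches the maximal score and a relative score of 1, while the site built from the least favoured bases gets a negative absolute score, which is why large negative values in a siteset point to poor matches.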
The p-value from TFMPvalue is more accurate than the sampling-based one; however, it is slower to compute in some cases.
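As a rough sketch of how these pieces could be combined to pick the best matches, the snippet below ranks the hits by absolute score and attaches the relative score and the TFMPvalue p-value to each row. It uses only the calls that appear in this thread plus `toPWM()`; building `pwm` from `MA0004.1` and the column name `score` in the `writeGFF3()` output are assumptions, and the code presumes the scan returns at least one hit.

```r
library(TFBSTools)
library(Biostrings)

data(MA0004.1)            # example profile shipped with TFBSTools
pwm <- toPWM(MA0004.1)    # assumption: PWM derived from the loaded profile
subject <- DNAString("GAATTCTCTCTTGTTGTAGTCTCTTGACAAAATG")
siteset <- searchSeq(pwm, subject, seqname = "seq1",
                     min.score = "60%", strand = "*")

# One row per hit; the absolute score is assumed to be in column `score`.
hits <- writeGFF3(siteset)
hits$relScore <- relScore(siteset)
hits$pvalue <- pvalues(siteset, type = "TFMPvalue")

# Best matches first: highest absolute score at the top.
ranked <- hits[order(hits$score, decreasing = TRUE), ]
head(ranked)
```

Raising min.score (for example to "80%") is a simple way to keep only the stronger hits before ranking; where exactly to set the cut-off is a judgment call rather than something fixed by the package.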
For more information, please refer to
Wasserman, W. W., & Sandelin, A. (2004). Applied bioinformatics for the identification of regulatory elements. Nature Reviews Genetics, 5(4), 276-287. doi:10.1038/nrg1315
Ge
Thank you very much
Hi,
I scanned a nucleotide sequence with a PWM and obtained the scores and the p-values.
I copied this example, changing only the transcription factor and the DNA sequence used:
example
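library(TFBSTools)
library(Biostrings)
data(MA0004.1)
pwm <- toPWM(MA0004.1)  # pwm was not defined in the snippet as posted; assumed to be built from the loaded profile
subject <- DNAString("GAATTCTCTCTTGTTGTAGTCTCTTGACAAAATG")
siteset <- searchSeq(pwm, subject, seqname="seq1", min.score="60%", strand="*")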
I calculated the scores:
head(writeGFF3(siteset))
relScore(siteset)
I calculated the p-values:
pvalues(siteset, type="TFMPvalue")
pvalues(siteset, type="sampling")
I obtained around 50 sequences with scores from -7 to 5. I would like to know how to interpret this score and how to choose the best matches. Which is the best score: the highest or the most negative? Is there a threshold to consider, or should I consider the relative score instead? Which of the two p-value methods do you consider the most accurate?
Thank you for your help,
Salvo