Closed olli0601 closed 5 years ago
Fairly simple - "sibling" = "none" (i.e. no ancestry), "intermingled" = "complex"
Hi Matthew
The reason I was asking is that you assign "complex" in some pairs when paths12>0 and paths21==0 and adjacent==TRUE. For these pairs, I think you still have evidence of directionality, and thus I assign those pairs to "12".
If you buy into this argument, "intermingled" reduces to those pairs with paths12>0 and paths21>0 and adjacent==TRUE. That s what I did for the Rakai analysis. All pairs who are "intermingled" have ambiguous evidence of direction.
Following the examples code in ?classify.pairwise.relationships, you can then check this with the line dwin %>% filter(paths12>0 & paths21==0 & adjacent)
Happy to discuss, olli
Is allow.mt == TRUE when you get "complex" with paths12>0 and paths21==0? That's what that option should control (if FALSE then "multiAnc" and "multiDesc" are changed to "complex", and this is the default though whether it should be is open to debate). If that occurs when it is TRUE then something is wrong and I'll take a look.
I just checked. As far as I can see allow.mt <- TRUE in function "classify.pairwise.relationships" of this branch works as before in your code. Though would be great if you could double check!
Hi Matthew
I am currently checking the code, and running into some issues. When I run the following, why is the female declared ancestral to the male?
` require(phyloscannerR) require(tidyverse)
file <-system.file(file.path('extdata','ptyr192_phsc_analyse_trees_output.R'),package='phyloscannerR') load(file) #loads 'phsc', output from 'phyloscanner.analyse.trees'
close.threshold <- 0.025
distant.threshold <- 0.05
tip.regex <- "^(.*)_fq[0-9]+read([0-9]+)count([0-9]+)$"
min.reads <- 30
min.tips <- 1
ans <- multinomial.calculations(phsc,
close.threshold=close.threshold,
tip.regex=tip.regex,
allow.mt=TRUE,
min.reads=min.reads,
min.tips=min.tips,
distant.threshold = distant.threshold,
relationship.types = c('proximity.3.way',
'any.ancestry',
'close.x.contiguous',
'close.and.contiguous',
'close.and.contiguous.and.directed',
'close.and.adjacent.and.directed',
'close.and.contiguous.and.ancestry.cat',
'close.and.adjacent.and.ancestry.cat',
'adjacent.and.proximity.cat'),
verbose=TRUE)
dwino<- ans$dwin
dco<- ans$rplkl
str(subset(dwino, host.1=='RkA05193F' & host.2=='RkA06730M' & tree.id=='1025_to_1274'))
`
I can't run that code in either branch without errors, but that doesn't surprise me as a classification in that tree. There will be one RkA06730M subgraph descended from what looks like a RkA05193F subgraph. (RkA05210M makes the situation questionable, but the classification seems right.)
Hi Matthew
What is the error in running the above? I am in the migrate_counts branch, then source the multinomial code in the "deprecated" folder of that branch manually.
What I am confused about in the above window is that we have paths12==0 & paths21>0 and ancestry=='anc'. Why would that be?
Thanks.
The error:
Calculating k.eff and n.eff for windows (n=0)...
Error in mutate_impl(.data, dots) :
Evaluation error: object 'k' not found.
In addition: There were 50 or more warnings (use warnings() to see the first 50)
The problem with the table was that I neglected to change the order of all the columns when ensuring that host.1
and host.2
are in alphabetical order, which I've now fixed on the main branch. Sorry. The paths columns weren't subsequently used before.
However, I think in the long run (although not now if it means you would have to rerun anything) I'd prefer to change it so the classify
function knows about the variations and calls "anc"/"desc" based on the different particular variation asked for, which will allow me to pull the allow.mt option out of merge.classifications as well. Feels cleaner. Is this OK? Then in your case you just need to pay attention to "anc"/"desc" not the path counts.
E.g
classify <- function(ptree, allow.mt = F, strict.ancestry = T, verbose = F, no.progress.bars = F) {...
where strict.ancestry
is your option. If that is FALSE then an "anc" will obey your rules.
Hi Matthew
Thanks for the fix. I tested the new branch on Rakai data, and it now returns identical results.
Some changes:
.cat
Would you be able to edit your analyse_trees.Rscript
so that it saves dwin
and dc
of the following lines to one or two (as you prefer) rda files as below. When done, I can then re-run on the entire RCCS data.
close.threshold <- 0.025
distant.threshold <- 0.05
relationship.types <- c('proximity.3.way',
'any.ancestry',
'close.x.contiguous',
'close.and.adjacent',
'close.and.adjacent.and.directed',
'close.and.adjacent.and.ancestry',
'close.and.contiguous',
'close.and.contiguous.and.directed',
'close.and.contiguous.and.ancestry')
use.paths.to.define.relationships <- FALSE
tip.regex <- "^(.*)_fq[0-9]+_read_([0-9]+)_count_([0-9]+)$"
min.reads <- 30
min.tips <- 1
verbose <- TRUE
dwin <- classify.pairwise.relationships(phsc,
allow.mt=allow.mt,
close.threshold=close.threshold,
distant.threshold=distant.threshold,
use.paths.to.define.relationships=use.paths.to.define.relationships,
relationship.types=relationship.types,
verbose=TRUE)
tmp <- select.windows.by.read.and.tip.count(phsc, dwin, tip.regex, min.reads, min.tips)
dc <- count.pairwise.relationships(tmp)
Thank you! olli
There is already an option to save the entire workspace as an RDA, so I'd prefer to just add an option to perform these calculations as part of phyloscanner_analyse_trees.R. Then dwin
and dc
would be available along with everything else if --rda
is specified.
Hi Matthew
I have been preparing an update of the functions in "multinomial_counts.R". What did I do:
Let s discuss on Wednesday, though it would be good if you could follow the example in 'count.pairwise.relationships' beforehand, and review the classifications in file 'classify.pairwise.relationships'. I don t really understand your ancestry=c("none", "complex"), and how this fits with "sibling"/"intermingled".
Cheers
olli