ococrook / hdxstats

hdxstats: An R-package for statistical analysis of hydrogen deuterium exchange mass-spectrometry data.
Apache License 2.0

Manhattan plot - multiple charge states #8

Open nlgittens opened 2 years ago

nlgittens commented 2 years ago

When generating a Manhattan plot with some alternative data, I encounter this error:

Error in `$<-.data.frame`(`*tmp*`, "protection", value = c(XXXXXXX_2 = 1, : replacement has 96 rows, data has 91

From what I can gather, the issue is that the original data frame has 96 unique peptides, but when given the HdxStatsRes object the function only sees 91. This is because the charge state is concatenated into the peptide names in HdxStatsRes, whereas region = protein[, c("Start", "End")] does not take into account that the dataset might contain several peptides with the same ID but different charge states. Is the fix as easy as adding "Charge" to that line of code? I haven't managed to produce the Manhattan plot with my data, so I haven't been able to test that. (I'm getting some other errors with it that I haven't been able to resolve yet.)
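A minimal sketch of the suspected mismatch, using made-up peptides. Only the `Start`, `End` and `Charge` column names and the `_charge` name suffix come from this thread; the `Sequence` column, the toy values, and the use of `unique()` to build the region table are assumptions for illustration, not the hdxstats source.

```r
# Toy data (hypothetical): two peptides are observed in two charge states.
protein <- data.frame(
  Sequence = c("AAAK", "AAAK", "GLSDGEW", "GLSDGEW", "VLHK"),
  Start    = c(1, 1, 5, 5, 12),
  End      = c(4, 4, 11, 11, 15),
  Charge   = c(1, 2, 2, 3, 2)
)

# Keyed on position only: charge states of the same peptide collapse.
region_pos <- unique(protein[, c("Start", "End")])

# Keyed on position and charge: one row per peptide/charge pair.
region_chg <- unique(protein[, c("Start", "End", "Charge")])

# One value per peptide/charge pair, named "<Sequence>_<Charge>".
protection <- setNames(
  rep(1, nrow(region_chg)),
  paste(protein$Sequence, protein$Charge, sep = "_")
)

# Assigning the longer vector to the position-only table fails with the
# same kind of "replacement has N rows, data has M" error.
res <- tryCatch(
  {
    region_pos$protection <- protection
    "ok"
  },
  error = function(e) conditionMessage(e)
)
print(res)
```

If the region table is built this way, including `"Charge"` in the selected columns makes the row counts agree, though the plot would then need a rule for how overlapping charge states are drawn.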

ococrook commented 2 years ago

Hi Nathan, yes, this will be a charge-related issue. Could you send me a small example so that I can fix the function?

nlgittens commented 2 years ago

Yes, I'll put something in our fileshare today.

ococrook commented 2 years ago

This should be fixed in the latest version. If you just pick one charge state, everything should be fine; I still need to work out what we visualise when there are multiple charge states.
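One way to "pick one charge state" before plotting, sketched on toy data. Keeping the lowest charge per peptide is an arbitrary choice made here for illustration, and the `Sequence` column and values are hypothetical; the thread does not say which charge state should be preferred.

```r
# Toy data (hypothetical): two peptides are observed in two charge states.
protein <- data.frame(
  Sequence = c("AAAK", "AAAK", "GLSDGEW", "GLSDGEW", "VLHK"),
  Start    = c(1, 1, 5, 5, 12),
  End      = c(4, 4, 11, 11, 15),
  Charge   = c(2, 1, 3, 2, 2)
)

# Sort so that the lowest charge comes first within each (Start, End) pair.
protein <- protein[order(protein$Start, protein$End, protein$Charge), ]

# Keep the first row for each (Start, End) pair, dropping other charge states.
one_charge <- protein[!duplicated(protein[, c("Start", "End")]), ]

print(one_charge)
```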

nlgittens commented 2 years ago

Thanks for working on that, Olly. I'm still getting an issue in which R removes the data for 91 peptides (which is all of them) in the Manhattan plot. The diffdata, region and sequences objects all contain data. I don't know if this warning message suggests any immediate ideas:

"Warning message: Removed 91 rows containing missing values (geom_point)."
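ggplot2 gives this warning when rows have `NA` in an aesthetic that `geom_point` needs, or fall outside the axis limits. A quick base R check on the table passed to the plot can show which column is responsible. The table and its column names (`position`, `protection`) are hypothetical stand-ins, not the actual objects built by hdxstats.

```r
# Hypothetical plotting table in which every y-value is missing.
plot_df <- data.frame(
  position   = c(1, 2, 3, 4),
  protection = c(NA_real_, NA_real_, NA_real_, NA_real_)
)

# Number of missing values in each column.
na_per_column <- colSums(is.na(plot_df))
print(na_per_column)

# Rows that would be dropped because x or y is missing.
n_dropped <- sum(!complete.cases(plot_df[, c("position", "protection")]))
print(n_dropped)
```

If every row is dropped, as in the warning above, one column is likely to be entirely `NA`, which can happen when values are looked up by names that do not match.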

nlgittens commented 2 years ago

I've fixed that issue now, although I now get a problem where there are more peptides (plotted on the x-axis) than p-values. hdxstats has neither left a gap for nor removed the peptides with no data, so there are now a bunch of peptides at the end with no data, and presumably the other p-values are now misaligned.
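The misalignment described above is what positional matching would produce when some peptides have no p-value. A sketch of the difference between positional and name-based alignment, using made-up peptide names and p-values (this is not how hdxstats is known to do it):

```r
# Hypothetical peptide labels on the x-axis; "pepB_2" has no p-value.
peptides <- c("pepA_2", "pepB_2", "pepC_3", "pepD_2")
pvals    <- c(pepA_2 = 0.01, pepC_3 = 0.20, pepD_2 = 0.03)

# Positional fill: values shift left and the gap ends up at the end.
positional <- c(unname(pvals), rep(NA_real_, length(peptides) - length(pvals)))

# Name-based fill: each p-value stays with its peptide, gaps become NA.
aligned <- unname(pvals[match(peptides, names(pvals))])

print(data.frame(peptides, positional, aligned))
```

With name-based matching, the peptide without data keeps its own slot as `NA` instead of pushing the later p-values onto the wrong peptides.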

ococrook commented 2 years ago

It's hard to see how this happens without some data. Is there anything you could share?

nlgittens commented 2 years ago

I put an example in the fileshare that should be more helpful.

nlgittens commented 2 years ago

manhattan/multiple charge states reminder