CambridgeCentreForProteomics / camprotR

https://cambridgecentreforproteomics.github.io/camprotR/
MIT License

Sequence columns #36

Open TomSmithCGAT opened 2 years ago

TomSmithCGAT commented 2 years ago

PD output sometimes includes the Annotated Sequence column but not the Sequence column. Some functions, e.g. https://github.com/CambridgeCentreForProteomics/camprotR/blob/6058ca4acbaf071fc2b0f019adfe7001b043f150/R/ptm.R#L153, need the 'unannotated' sequence.

We should include a function to create the unannotated sequence, e.g.: obj$Sequence <- toupper(sapply(strsplit(obj$Annotated.Sequence, '\\.'), '[[', 2))
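
A minimal sketch of what such a helper might look like. The function name `add_unannotated_sequence` and its arguments are hypothetical, not part of camprotR, and it assumes the annotated sequence has the form `[K].pEPTIDEk.[R]`, with flanking residues separated from the peptide by dots. Unlike the one-liner, it falls back to the whole string when no flanking residues are present rather than failing with a subscript error.

```r
# Hypothetical helper (not part of camprotR): derive the plain peptide
# sequence from the Annotated.Sequence column of PD output.
# Assumes the format "[K].pEPTIDEk.[R]" (flank.peptide.flank).
add_unannotated_sequence <- function(obj,
                                     annotated_col = "Annotated.Sequence",
                                     sequence_col = "Sequence") {
  if (!annotated_col %in% colnames(obj)) {
    stop("Column '", annotated_col, "' not found in input", call. = FALSE)
  }

  # Split on literal dots; fixed = TRUE avoids regex escaping
  parts <- strsplit(as.character(obj[[annotated_col]]), ".", fixed = TRUE)

  core <- vapply(parts, function(x) {
    if (length(x) >= 2) {
      x[[2]]          # middle element when flanking residues are present
    } else if (length(x) == 1) {
      x[[1]]          # no flanking residues: assume the string is the peptide
    } else {
      NA_character_   # empty input
    }
  }, character(1))

  # Lower-case letters mark modified residues, so convert to upper case
  obj[[sequence_col]] <- toupper(core)
  obj
}
```

Usage would then be along the lines of `obj <- add_unannotated_sequence(obj)` before calling any function that needs the Sequence column.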

We should also add some checks so that a more sensible error is thrown when the annotated sequence column is provided but the Sequence column is missing.
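
One way such a check could look, as a sketch only. The function name `check_sequence_column` and the error wording are hypothetical; the idea is simply to fail early with a message that says which column is missing and what is available instead.

```r
# Hypothetical check (not part of camprotR): fail early with a clear
# message when the Sequence column is absent.
check_sequence_column <- function(obj,
                                  sequence_col = "Sequence",
                                  annotated_col = "Annotated.Sequence") {
  if (sequence_col %in% colnames(obj)) {
    return(invisible(TRUE))
  }

  if (annotated_col %in% colnames(obj)) {
    # Only the annotated column is present: tell the user what to do
    stop(
      "Column '", sequence_col, "' is missing but '", annotated_col,
      "' is present. Derive the unannotated sequence from '",
      annotated_col, "' first.",
      call. = FALSE
    )
  }

  stop(
    "Neither '", sequence_col, "' nor '", annotated_col,
    "' was found in the input.",
    call. = FALSE
  )
}
```

Functions that need the unannotated sequence could call this at the top, so the user sees the message above rather than a failure further down.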