adder closed 5 months ago
Hello @adder, I am from the SomaLogic Global Scientific Engagement team and can answer your questions!

Each SeqId is a unique numeric identifier that has a 1:1 relationship with a SOMAmer. Protein target names and UniProt IDs are also included in SomaScan .adat files to better identify each protein. Because SOMAmers can be designed for specific proteoforms and regions of proteins, the UniProt ID does not always distinguish the protein target. The Target Full Name in the .adat is a better discriminator when the SOMAmer targets a known proteoform. When the Target Full Name is also duplicated, the amino acid region and RefSeq ID for the protein construct can be found at https://menu.somalogic.com after logging in, either by clicking on an individual protein or by clicking the "Export" button to get the full list as an Excel file.

Since SOMAmers with duplicated UniProt IDs can represent different features of the protein, collapsing the values is not the best approach. If both SOMAmers represent the total concentration of the protein under the study conditions, collapsing them would not be entirely detrimental, but if they measure different isoforms or regions of the protein, collapsing them may lose some significant results. We recommend treating each analyte as an independent measurement.

If you have any more questions, feel free to email techsupport@somalogic.com. The Global Scientific Engagement team is available to respond via email or video call. Thank you for your questions and for using our R package!
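As a rough illustration of the recommendation above, the following Python sketch lists which UniProt IDs are shared by more than one SeqId without averaging or merging anything. It is not part of SomaDataIO (an R package) and does not read .adat files. The annotation rows, the SeqId and UniProt values, and the helper names `group_seqids_by_uniprot` and `shared_uniprot_ids` are all hypothetical placeholders chosen for the example.

```python
from collections import defaultdict

# Hypothetical annotation rows: (SeqId, UniProt ID, Target Full Name).
# These are placeholder values for illustration, not real SomaScan annotations.
annotations = [
    ("seq-0001", "P00001", "Protein A"),
    ("seq-0002", "P00001", "Protein A, isoform 2"),
    ("seq-0003", "P00002", "Protein B"),
]


def group_seqids_by_uniprot(rows):
    """Map each UniProt ID to the list of SeqIds annotated with it."""
    groups = defaultdict(list)
    for seq_id, uniprot, _target_full_name in rows:
        groups[uniprot].append(seq_id)
    return dict(groups)


def shared_uniprot_ids(rows):
    """Keep only the UniProt IDs that are annotated on more than one SeqId."""
    groups = group_seqids_by_uniprot(rows)
    return {uniprot: ids for uniprot, ids in groups.items() if len(ids) > 1}


# Report the shared IDs; each SeqId is still kept as its own measurement.
shared = shared_uniprot_ids(annotations)
for uniprot, seq_ids in shared.items():
    print(uniprot, "is shared by", ", ".join(seq_ids))
```

A listing like this only flags the analytes whose Target Full Name, amino acid region, and RefSeq ID are worth comparing. Each SeqId remains a separate column in the downstream analysis.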
Thanks for your detailed answer!
Hey,
Maybe this is a more general question about the SomaLogic platform, but I encountered it while analyzing data with SomaDataIO. I noticed that some "seqId" correspond to multiple UniProt IDs. So this would mean that one protein can be represented by multiple aptamers (at least for some proteins?). If this is indeed the case, where could I find more information on this? Why are there proteins with multiple aptamers? Is it wise to combine these values so that I have one measurement per protein (e.g. by averaging the RFUs per protein from the same sample)? Or am I missing something?
Thanks for this nice package!