lgatto closed this issue 5 years ago
Ping @ococrook - do you want to do this? If so, feel free to assign the issue to yourself.
@lgatto The data are provided with gene names rather than UniProt IDs. Do you recommend a reliable tool to convert them? I imagine you can use biomart?
Either biomart, or you can use UniProt directly at https://www.uniprot.org/uploadlists/
What naming convention should we use for the dataset? Following convention we would have Orre2019, but there are 9 different datasets, so the cell line should be in the title. Do we want A431Orre2019, a431Orre2019, Orre2019A431, etc.?
I would prefer orre2019
followed by the cell line, every letter lowercase:
orre2019a431
orre2019mcf7
...
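For illustration only, the naming rule above (the `orre2019` prefix followed by the cell line in lowercase) could be sketched as a small helper. The function name `dataset_name` is hypothetical and not part of any package; the two cell lines are the ones named in this thread.

```python
def dataset_name(cell_line: str, prefix: str = "orre2019") -> str:
    """Return the prefix followed by the cell line, every letter lowercase.

    Hypothetical helper: only lowercasing (plus trimming surrounding
    whitespace) is applied, as suggested in the thread.
    """
    return prefix + cell_line.strip().lower()


# The two cell lines mentioned in the thread.
names = [dataset_name(c) for c in ["A431", "MCF7"]]
print(names)
```

Keeping the prefix as a default argument means the same helper would work if a later dataset followed the same pattern with a different author/year prefix.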
Thanks, Laurent. I have some cases where gene names have NA as the UniProt ID - any preference for what to do in these cases?
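To see how many gene names are affected before deciding what to do with them, a minimal sketch along these lines could separate mapped from unmapped names. It assumes the conversion result is already available as a plain dictionary in which a missing ID is stored as `None`; the function name `split_by_mapping` and the gene names and accessions below are placeholders, not real data.

```python
from typing import Dict, List, Optional, Tuple


def split_by_mapping(
    genes: List[str], mapping: Dict[str, Optional[str]]
) -> Tuple[Dict[str, str], List[str]]:
    """Split gene names into those with an ID and those without (NA).

    A gene counts as unmapped if it is absent from `mapping` or maps to None.
    """
    mapped: Dict[str, str] = {}
    unmapped: List[str] = []
    for gene in genes:
        accession = mapping.get(gene)
        if accession is None:
            unmapped.append(gene)
        else:
            mapped[gene] = accession
    return mapped, unmapped


# Placeholder data (assumption): GENE_B has no ID, GENE_C is missing entirely.
genes = ["GENE_A", "GENE_B", "GENE_C"]
mapping = {"GENE_A": "ACC_A", "GENE_B": None}

mapped, unmapped = split_by_mapping(genes, mapping)
print(f"{len(unmapped)} of {len(genes)} gene names have no ID: {unmapped}")
```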
You could try to figure out what version of Uniprot they used, which could possibly help.
Otherwise
Option 2 above is the fastest and is sensible.
I'll probably aim for option 2. UniProt keeps crashing too - probably because I am asking for too many proteins at a time. Thanks, Laurent!
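If the crashes really do come from submitting too many names at once, one workaround would be to split the list into smaller batches and submit them one at a time. The sketch below only does the splitting and makes no network calls; the batch size of 500 is an arbitrary assumption rather than a documented UniProt limit, and `batches` and the gene names are hypothetical.

```python
from typing import Iterator, List, Sequence


def batches(ids: Sequence[str], size: int = 500) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` identifiers."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


# Placeholder gene names (assumption), just to show the chunk sizes.
gene_names = [f"GENE{i}" for i in range(1, 1201)]
chunks = list(batches(gene_names, size=500))
print([len(chunk) for chunk in chunks])  # [500, 500, 200]
```

Each chunk could then be pasted or submitted separately, and the results concatenated afterwards.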
SubCellBarCode: Proteome-wide Mapping of Protein Localization and Relocalization
Subcellular localization is a main determinant of protein function; however, a global view of cellular proteome organization remains relatively unexplored. We have developed a robust mass spectrometry-based analysis pipeline to generate a proteome-wide view of subcellular localization for proteins mapping to 12,418 individual genes across five cell lines. Based on more than 83,000 unique classifications and correlation profiling, we investigate the effect of alternative splicing and protein domains on localization, complex member co-localization, cell-type-specific localization, as well as protein relocalization after growth factor inhibition. Our analysis provides information about the cellular architecture and complexity of the spatial organization of the proteome; we show that the majority of proteins have a single main subcellular location, that alternative splicing rarely affects subcellular location, and that cell types are best distinguished by expression of proteins exposed to the surrounding environment. The resource is freely accessible via www.subcellbarcode.org.
https://www.cell.com/molecular-cell/fulltext/S1097-2765(18)31005-0