MEP-LINCS / MEPDataExplorer

This application allows visualization of MEP-LINCS data for each cell line.
https://www.synapse.org/#!Synapse:syn2862345/wiki/410135
MIT License

curated feature names do not match with data #21

Open kdaily opened 7 years ago

kdaily commented 7 years ago

Some curated feature names are not present as columns in the data, such as:

 [1] "CellLine"                                                         
 [2] "Barcode"                                                          
 [3] "Spot"                                                             
 [4] "Drug"                                                             
 [5] "StainingSet"                                                      
 [6] "395nm"                                                            
 [7] "488nm"                                                            
 [8] "555nm"                                                            
 [9] "640nm"                                                            
[10] "750nm"                                                            
[11] "Spot_PA_SpotCellCount"                                            
[12] "Nuclei_CP_Intensity_MedianIntensity_Dapi"                         
[13] "Nuclei_CP_Intensity_IntegratedIntensity_Dapi"                     
[14] "Nuclei_PA_Cycle_DNA2NProportion"                                  
[15] "Nuclei_CP_Intensity_MedianIntensity_EdU"                          
[16] "Nuclei_PA_Gated_EdUPositiveProportion"                            
[17] "Nuclei_CP_Intensity_MedianIntensity_Fibrillarin"                  
[18] "Cytoplasm_CP_Intensity_MedianIntensity_KRT19"                     
[19] "Cytoplasm_PA_Gated_KRT19PositiveProportion"                       
[20] "Cytoplasm_CP_Intensity_MedianIntensity_KRT5"                      
[21] "Cytoplasm_PA_Gated_KRT5PositiveProportion"                        
[22] "Cytoplasm_CP_Intensity_MedianIntensity_MitoTracker"               
[23] "Cytoplasm_CP_Intensity_MedianIntensity_CellMask"                  
[24] "Cytoplasm_CP_Intensity_MedianIntensity_Actin"                     
[25] "Nuclei_CP_AreaShape_Area"                                         
[26] "Nuclei_CP_AreaShape_Perimeter"                                    
[27] "Nuclei_PA_Cycle_DNA2NProportionLogitRUVLoessBacktransformed"      
[28] "Nuclei_CP_Intensity_MedianIntensity_FibrillarinLog2RUVLoess"      
[29] "Cytoplasm_CP_Intensity_MedianIntensity_CellMaskLog2RUVLoess"      
[30] "Cytoplasm_CP_Intensity_MedianIntensity_ActinLog2RUVLoess"         
[31] "Nuclei_PA_Gated_EdUPositiveProportionLogitRUVLoessBacktransformed"
[32] "Spot_PA_SpotCellCountNorm"                                        
[33] "Nuclei_CP_Intensity_MedianIntensity_DapiNorm"                     
[34] "Nuclei_CP_Intensity_IntegratedIntensity_DapiNorm"                 
[35] "Nuclei_PA_Cycle_DNA2NProportionLogitNorm"                         
[36] "Nuclei_CP_Intensity_MedianIntensity_EdUNorm"                      
[37] "Nuclei_CP_Intensity_MedianIntensity_FibrillarinNorm"              
[38] "Cytoplasm_CP_Intensity_MedianIntensity_KRT19Norm"                 
[39] "Cytoplasm_CP_Intensity_MedianIntensity_KRT5Norm"                  
[40] "Cytoplasm_CP_Intensity_MedianIntensity_MitoTrackerNorm"           
[41] "Cytoplasm_CP_Intensity_MedianIntensity_CellMaskNorm"              
[42] "Cytoplasm_CP_Intensity_MedianIntensity_ActinNorm"                 
[43] "Nuclei_CP_AreaShape_AreaNorm"                                     
[44] "Nuclei_CP_AreaShape_PerimeterNorm"                                

These features do match:

 [1] "Well"                                                          
 [2] "MEP"                                                           
 [3] "Ligand"                                                        
 [4] "ECMp"                                                          
 [5] "Spot_PA_SpotCellCountLog2RUVLoess"                             
 [6] "Nuclei_CP_Intensity_MedianIntensity_DapiLog2RUVLoess"          
 [7] "Nuclei_CP_Intensity_IntegratedIntensity_DapiLog2RUVLoess"      
 [8] "Nuclei_CP_Intensity_MedianIntensity_EdULog2RUVLoess"           
 [9] "Cytoplasm_CP_Intensity_MedianIntensity_KRT19Log2RUVLoess"      
[10] "Cytoplasm_CP_Intensity_MedianIntensity_KRT5Log2RUVLoess"       
[11] "Cytoplasm_CP_Intensity_MedianIntensity_MitoTrackerLog2RUVLoess"
[12] "Nuclei_CP_AreaShape_AreaLog2RUVLoess"                          
[13] "Nuclei_CP_AreaShape_PerimeterLog2RUVLoess"                     
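
For anyone wanting to reproduce the two lists above, here is a minimal sketch of the comparison. It is written in Python rather than the R used to print the lists (R's `intersect` and `setdiff` do the same job), and it is not the app's actual code. The function name `partition_features` is hypothetical, and the `columns` list is a small illustrative stand-in for a real data file header, using names taken from the lists above.

```python
def partition_features(curated, columns):
    """Split curated names into those present in the data columns and those absent.

    Order follows the curated list, so the result can be compared line by
    line against the curated file.
    """
    column_set = set(columns)  # set membership keeps the check fast
    matched = [name for name in curated if name in column_set]
    missing = [name for name in curated if name not in column_set]
    return matched, missing


# Illustrative inputs only (assumption): a few curated names and a
# shortened stand-in for the column names of one data file.
curated = [
    "Well",
    "MEP",
    "CellLine",
    "Spot_PA_SpotCellCountLog2RUVLoess",
    "Spot_PA_SpotCellCountNorm",
]
columns = ["Well", "MEP", "Ligand", "ECMp", "Spot_PA_SpotCellCountLog2RUVLoess"]

matched, missing = partition_features(curated, columns)
print("match:", matched)
print("do not match:", missing)
```

The comparison is an exact, case-sensitive string match, so a suffix difference such as `Norm` versus `Log2RUVLoess` counts as a miss.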
markdane commented 7 years ago

This file is a subset of feature names drawn from different, non-overlapping datasets, so only a subset of these feature names will be in any particular dataset. However, the matching features are from the November 2016 release, and I'm hoping you will point this tool at the May 2017 release. If cell line still does not match, then there's a different problem. Any feature name that is in this file AND in a data file is metadata or data appropriate to display to end users.
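
One way to tell "expected to be missing from this dataset" apart from "a different problem" is to check each curated name against every dataset at once. The sketch below assumes that approach; it is in Python, not the tool's own code, and the function names, the dataset labels `dataset_a` and `dataset_b`, and their column lists are all hypothetical placeholders.

```python
def coverage_by_dataset(curated, datasets):
    """Map each curated name to the labels of the datasets whose columns include it."""
    column_sets = {label: set(cols) for label, cols in datasets.items()}
    return {
        name: [label for label, cols in column_sets.items() if name in cols]
        for name in curated
    }


def orphaned(coverage):
    """Curated names found in no dataset at all; these point to a real mismatch."""
    return [name for name, labels in coverage.items() if not labels]


# Hypothetical inputs (assumption): two datasets with different columns.
datasets = {
    "dataset_a": ["Well", "MEP", "Ligand", "Nuclei_CP_AreaShape_AreaLog2RUVLoess"],
    "dataset_b": ["Well", "MEP", "ECMp", "Nuclei_CP_AreaShape_AreaNorm"],
}
curated = ["Well", "Ligand", "ECMp", "CellLine", "Nuclei_CP_AreaShape_AreaNorm"]

coverage = coverage_by_dataset(curated, datasets)
for name, labels in coverage.items():
    print(name, "->", labels if labels else "not found in any dataset")
print("orphaned:", orphaned(coverage))
```

A name that appears in at least one dataset is consistent with the curated file spanning several datasets; a name that appears in none is worth investigating.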

kdaily commented 7 years ago

Yes, I'm using the May release data. This is different behavior from the existing data: every column name in the curated feature list was a column name in the November 2016 level 4 files.