Open moqri opened 3 months ago
Two of the samples in this dataset, GSM2998097 and GSM2998106, are loaded incorrectly.
The cause seems to be how these two samples are formatted in GEO.
This is my code to correct for these:
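import pandas as pd

bad = ['GSM2998097', 'GSM2998106']  # the two mis-formatted samples
# Read the header block once; row 13 holds the cell type for most samples, row 14 for the two bad ones.
header = pd.read_table(<path>, nrows=10**2, skiprows=38, index_col=0)
# str.strip('cell type: ') strips a set of characters from both ends, not the prefix,
# so the literal prefix is removed with str.replace instead.
meta1 = header.iloc[13].drop(bad).str.replace('cell type: ', '', regex=False)
meta2 = header[bad].iloc[14].str.replace('cell type: ', '', regex=False)
meta = pd.concat([meta1, meta2])
# Load the methylation matrix and drop the trailing end-of-table marker row.
dnam = pd.read_table(mat, nrows=10**6, skiprows=38+59, index_col=0)
dnam = dnam.drop('!series_matrix_table_end')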
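A note on the prefix removal: `str.strip('cell type: ')` treats its argument as a set of characters to trim from both ends, not as a literal prefix, so it can also eat into the label itself. A small sketch with made-up labels (the values and sample names below are illustrative, not taken from this dataset):

```python
import pandas as pd

# Made-up values in the "cell type: <label>" form; not from the actual dataset.
s = pd.Series(['cell type: T cell', 'cell type: monocyte'],
              index=['sample_a', 'sample_b'])

# strip() trims any of the characters c, e, l, t, y, p, ':' and ' ' from both ends.
stripped = s.str.strip('cell type: ')

# replace() with regex=False removes only the literal substring 'cell type: '.
replaced = s.str.replace('cell type: ', '', regex=False)

print(stripped.tolist())
print(replaced.tolist())
```

With these inputs the `strip` version also trims the ends of the labels, while the `replace` version leaves them whole.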
Interesting. Perhaps we need an option for adding some kind of post-load corrections.
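One possible shape for such an option, purely as a sketch: a function that applies per-sample overrides to the metadata after loading. The name `apply_post_load_corrections`, the `corrections` mapping, and the sample IDs and labels below are all hypothetical and not part of any existing API.

```python
import pandas as pd

def apply_post_load_corrections(meta, corrections):
    """Return a copy of `meta` with per-sample overrides applied.

    `corrections` maps sample ID -> corrected value (hypothetical interface).
    """
    unknown = [sid for sid in corrections if sid not in meta.index]
    if unknown:
        # Fail loudly rather than silently adding new samples.
        raise KeyError(f"unknown sample IDs: {unknown}")
    fixed = meta.copy()
    for sample_id, value in corrections.items():
        fixed.loc[sample_id] = value
    return fixed

# Placeholder IDs and labels, for illustration only.
meta = pd.Series({'GSM0000001': 'label A', 'GSM0000002': 'tissue: X'})
fixed = apply_post_load_corrections(meta, {'GSM0000002': 'label B'})
```

Returning a copy keeps the originally loaded metadata available for comparison, and rejecting unknown IDs guards against typos in the correction table.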