sdash-github closed 7 years ago
-- Added a 'key' column to the spreadsheet. -- Added a 2nd row in the data file with just the SRR number (key). Remove the first row if you have to.
** I suppose it is now ready for loading.
Done, and Connor used it to load the expression data.
Sample uniquenames have to match the column headers in the expression data file for loading. The data processing steps use the SRR run accession number as the column header for each sample, because the SRA data files use this number as their file names. So far, for loading, I have been replacing the column headers in the data files with sample uniquenames almost by hand, so that they match the biomaterial-related tables in Chado.

Discussion with Connor: add an extra 'key' column to the 'Samples' sheet that holds these SRR numbers (a duplicate of the 'sra_run' column, used as a key). The loader would refer to this column to look up which data column corresponds to which sample uniquename.

Future-proofing: if a dataset is not from SRA, we can use keys of our choice instead of the SRA run accession.
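
To illustrate the lookup described above, here is a minimal sketch in Python. It is not the actual loader code. The column name `sample_uniquename`, the function names, the sample names, and the SRR accessions are all made up for the example; only the 'key' and 'sra_run' columns come from the discussion.

```python
import csv
import io


def build_key_map(samples_rows, key_col="key", name_col="sample_uniquename"):
    """Map each key (e.g. an SRR run accession) to its sample uniquename.

    Column names are assumptions; adjust them to the real 'Samples' sheet.
    """
    key_map = {}
    for row in samples_rows:
        key = row[key_col].strip()
        if key in key_map:
            raise ValueError(f"duplicate key in Samples sheet: {key}")
        key_map[key] = row[name_col].strip()
    return key_map


def rename_headers(header, key_map, id_col=0):
    """Replace data-column headers with sample uniquenames.

    The column at index id_col (assumed to hold feature names) is kept as-is.
    """
    missing = [h for i, h in enumerate(header) if i != id_col and h not in key_map]
    if missing:
        raise KeyError(f"no Samples entry for data columns: {missing}")
    return [h if i == id_col else key_map[h] for i, h in enumerate(header)]


# Hypothetical 'Samples' sheet exported as CSV, with 'key' duplicating 'sra_run'.
samples_csv = (
    "sample_uniquename,sra_run,key\n"
    "leaf_rep1,SRR0000001,SRR0000001\n"
    "root_rep1,SRR0000002,SRR0000002\n"
)

# Hypothetical tab-separated expression file whose headers are SRR numbers.
expression_tsv = "gene\tSRR0000001\tSRR0000002\ngeneA\t1.5\t0.0\n"

key_map = build_key_map(csv.DictReader(io.StringIO(samples_csv)))
rows = list(csv.reader(io.StringIO(expression_tsv), delimiter="\t"))
rows[0] = rename_headers(rows[0], key_map)
print(rows[0])  # ['gene', 'leaf_rep1', 'root_rep1']
```

Because the lookup only depends on the 'key' column, a non-SRA dataset would work the same way with whatever keys are chosen, as long as they match the data file headers. Raising an error on unmatched or duplicate keys is a design choice for the sketch; it avoids silently loading a column under the wrong sample.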