zhoulab / p53-chip-seq-data

Basic machine learning on genomic data

Melt current master table format #8

Closed victorlin closed 7 years ago

victorlin commented 7 years ago

Example:

chr     start   end     sample_name             MACS_score  P53match_score  FE  annotation  gene    repeat_count    peak_length repeat_proportion
chr2L   5763    6010    R_w1118_P53_NT60A_SKc   14.55       ?               ?   Satellite   CG11023 155             247         0.627530364
chr2L   5763    6010    R_w1118_P53_XR60A_SKc   19.1        ?               ?   Satellite   CG11023 155             247         0.627530364

varsh1090 commented 7 years ago

Sample names follow the format Cell_type_P53_TreatmentTimeRepeat, e.g. R_w1118_P53_XR60A_SKc:

- Cell_type: R_w1118, S_w1118 or Kc167 (whatever comes before _P53)
- Treatment: NT or XR
- Time: 30, 60 or 120
- Repeat: A, B or C

varsh1090 commented 7 years ago

Sample names starting with R also have an extra _SKc suffix. Ignore it when making the individual columns, since it does not fall under any of the Cell_type_P53_TreatmentTimeRepeat categories, and keep it only in the sample name column.
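
A minimal sketch of how the naming rules above might be turned into individual columns, in case it helps. It uses only the standard library; the function name `parse_sample_name`, the field names, and the choice to return Time as an integer are assumptions for illustration, not part of the existing pipeline. It also assumes the only allowed values are the ones listed above (NT/XR, 30/60/120, A/B/C) and that the optional suffix is always exactly `_SKc`.

```python
import re

# Pattern for Cell_type_P53_TreatmentTimeRepeat, with an optional _SKc suffix
# that is matched but not captured (it stays only in the full sample name).
SAMPLE_RE = re.compile(
    r"^(?P<cell_type>.+?)_P53_"   # everything before _P53, e.g. R_w1118
    r"(?P<treatment>NT|XR)"       # treatment
    r"(?P<time>30|60|120)"        # time
    r"(?P<repeat>[ABC])"          # repeat
    r"(?:_SKc)?$"                 # extra suffix on R samples, ignored
)


def parse_sample_name(name):
    """Split a sample name into its component fields (hypothetical helper)."""
    match = SAMPLE_RE.match(name)
    if match is None:
        raise ValueError(f"Unexpected sample name format: {name!r}")
    fields = match.groupdict()
    fields["time"] = int(fields["time"])  # assumption: store time as an int
    return fields


if __name__ == "__main__":
    print(parse_sample_name("R_w1118_P53_XR60A_SKc"))
```

Names that do not fit the pattern raise a `ValueError` rather than being silently split, so unexpected samples in the master table would surface early.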

victorlin commented 7 years ago

Updated master table file: /ufrc/zhou/share/projects/bioinformatics/p53-chip-seq-plots/results/ChIP_master_table.txt