BackofenLab / Cherri

https://backofenlab.github.io/Cherri/
GNU General Public License v3.0

Training data and retraining of the models #36

Open teresa-m opened 2 years ago

teresa-m commented 2 years ago

We should have the definitive feature CSV files (training data) stored somewhere for the following datasets:

  1. PARIS human only RRIs used for occupied regions (paris_human_RRI)
  2. PARIS human RRIs and RBP binding sites used for occupied regions (paris_human_RBPs)
  3. PARIS mouse only RRIs used for occupied regions (paris_mouse_RRI)
  4. SPLASH human only RRIs used for occupied regions (splash_human_RRI)
  5. PARIS + SPLASH human only RRIs used for occupied regions (paris_splash_human_RRI)
  6. paris_human_RRI + paris_human_RBPs + paris_splash_human_RRI (full_human)
  7. paris_human_RRI + paris_human_RBPs + paris_splash_human_RRI + splash_human_RRI (full)
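
The composition rules in the list above can be written down as a small mapping, which makes it easier to see which base datasets end up in each merged set. This is only a sketch: the names `BASE`, `COMPOSITION` and `expand` are hypothetical and not part of Cherri; only the dataset names are taken from the list.

```python
# Hypothetical sketch: dataset names come from the list above,
# the mapping and the helper are illustrative and not part of Cherri.
BASE = {"paris_human_RRI", "paris_human_RBPs", "paris_mouse_RRI", "splash_human_RRI"}

# Merged datasets, exactly as composed in items 5 to 7 of the list.
COMPOSITION = {
    "paris_splash_human_RRI": ["paris_human_RRI", "splash_human_RRI"],
    "full_human": ["paris_human_RRI", "paris_human_RBPs", "paris_splash_human_RRI"],
    "full": [
        "paris_human_RRI",
        "paris_human_RBPs",
        "paris_splash_human_RRI",
        "splash_human_RRI",
    ],
}


def expand(name):
    """Return the set of base datasets contained in `name`."""
    if name in BASE:
        return {name}
    parts = set()
    for part in COMPOSITION[name]:
        parts |= expand(part)
    return parts


if __name__ == "__main__":
    for merged in COMPOSITION:
        print(merged, sorted(expand(merged)))
```

Expanded this way, `full` and `full_human` contain the same three base datasets as listed, whereas the calls further down in this issue build `full` from `full_human` plus `paris_mouse_RRI`.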

For these datasets we should have models trained on the following data combinations:

  1. paris_human
  2. paris_human_rbp
  3. paris_mouse
  4. splash_human
  5. full_human_rri
  6. full_human
  7. full

teresa-m commented 2 years ago

TODO

teresa-m commented 2 years ago

Calls:

paris_human_RRI

python get_features.py -i /vol/scratch/data/pos_neg_data_context/full_noRBPs/paris_HEK293T_context_150_pos_occ_pos.csv -f all -o /vol/scratch/data/features_files/all_feature_files/paris_human_RRI_pos.csv

python get_features.py -i /vol/scratch/data/pos_neg_data_context/full_noRBPs/paris_HEK293T_context_150_pos_occ_neg.csv -f all -o /vol/scratch/data/features_files/all_feature_files/paris_human_RRI_neg.csv

paris_human_RBPs

python get_features.py -i /vol/scratch/data/pos_neg_data_context/full_c150_shortRBPs/paris_HEK_context_150_pos_occ_pos.csv -f all -o /vol/scratch/data/features_files/all_feature_files/paris_human_RBPs_pos.csv

python get_features.py -i /vol/scratch/data/pos_neg_data_context/full_c150_shortRBPs/paris_HEK_context_150_pos_occ_neg.csv -f all -o /vol/scratch/data/features_files/all_feature_files/paris_human_RBPs_neg.csv

paris_mouse_RRI

python get_features.py -i /vol/scratch/data/pos_neg_data_context/mouse_full_c150/paris_mouse_context_150_pos_occ_pos.csv -f all -o /vol/scratch/data/features_files/all_feature_files/paris_mouse_RRI_pos.csv

python get_features.py -i /vol/scratch/data/pos_neg_data_context/mouse_full_c150/paris_mouse_context_150_pos_occ_neg.csv -f all -o /vol/scratch/data/features_files/all_feature_files/paris_mouse_RRI_neg.csv

splash_human_RRI

python get_features.py -i /vol/scratch/data/pos_neg_data_context/full_SPLASH_hES/SPLASH_hES_context_150_pos_occ_pos.csv -f all -o /vol/scratch/data/features_files/all_feature_files/splash_human_RRI_pos.csv

python get_features.py -i /vol/scratch/data/pos_neg_data_context/full_SPLASH_hES/SPLASH_hES_context_150_pos_occ_neg.csv -f all -o /vol/scratch/data/features_files/all_feature_files/splash_human_RRI_neg.csv
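
The eight `get_features.py` calls above differ only in the input subdirectory, the file prefix and the dataset name, so they could be generated from a small table instead of being typed by hand. The following is a minimal sketch under that assumption: the paths and the `-i`, `-f`, `-o` options are copied from the calls above, while `INPUTS` and `feature_commands` are hypothetical names introduced for illustration.

```python
from pathlib import PurePosixPath

# Paths copied from the calls above; table layout and function name are illustrative.
IN_ROOT = PurePosixPath("/vol/scratch/data/pos_neg_data_context")
OUT_ROOT = PurePosixPath("/vol/scratch/data/features_files/all_feature_files")

# dataset name -> (input subdirectory, input file prefix)
INPUTS = {
    "paris_human_RRI": ("full_noRBPs", "paris_HEK293T"),
    "paris_human_RBPs": ("full_c150_shortRBPs", "paris_HEK"),
    "paris_mouse_RRI": ("mouse_full_c150", "paris_mouse"),
    "splash_human_RRI": ("full_SPLASH_hES", "SPLASH_hES"),
}


def feature_commands():
    """Build one get_features.py argument list per dataset and label."""
    commands = []
    for name, (subdir, prefix) in INPUTS.items():
        for label in ("pos", "neg"):
            infile = IN_ROOT / subdir / f"{prefix}_context_150_pos_occ_{label}.csv"
            outfile = OUT_ROOT / f"{name}_{label}.csv"
            commands.append(
                ["python", "get_features.py", "-i", str(infile),
                 "-f", "all", "-o", str(outfile)]
            )
    return commands


if __name__ == "__main__":
    # Only print the commands; nothing is executed here.
    for cmd in feature_commands():
        print(" ".join(cmd))
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` from the directory that contains `get_features.py`; the sketch itself only prints the commands so they can be compared against the calls above first.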

paris_splash_human_RRI

awk 'FNR==1 && NR!=1{next;}{print}' /vol/scratch/data/features_files/all_feature_files/paris_human_RRI_pos.csv /vol/scratch/data/features_files/all_feature_files/splash_human_RRI_pos.csv > /vol/scratch/data/features_files/all_feature_files/paris_splash_human_RRI_pos.csv

awk 'FNR==1 && NR!=1{next;}{print}' /vol/scratch/data/features_files/all_feature_files/paris_human_RRI_neg.csv /vol/scratch/data/features_files/all_feature_files/splash_human_RRI_neg.csv > /vol/scratch/data/features_files/all_feature_files/paris_splash_human_RRI_neg.csv
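
For readers unfamiliar with the awk idiom used here: `FNR` is the line number within the current file and `NR` the line number across all files, so `FNR==1 && NR!=1{next;}` skips the first line (the header) of every file except the first one, and `{print}` writes everything else. A rough Python equivalent is sketched below; `concat_csv` is a hypothetical helper name, and the sketch assumes every input file starts with a single header line.

```python
def concat_csv(inputs, output):
    """Concatenate CSV files, keeping only the header of the first file.

    Assumes each input starts with exactly one header line.
    """
    with open(output, "w", newline="") as out:
        for index, path in enumerate(inputs):
            with open(path, newline="") as handle:
                header = handle.readline()
                if index == 0:
                    # Keep the header of the first file only.
                    out.write(header if header.endswith("\n") else header + "\n")
                for line in handle:
                    # Like awk's print, make sure every line ends with a newline.
                    out.write(line if line.endswith("\n") else line + "\n")
```

A call such as `concat_csv(["paris_human_RRI_pos.csv", "splash_human_RRI_pos.csv"], "paris_splash_human_RRI_pos.csv")` would then mirror the first awk call above, with the directory prefix added to each file name.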

full_human

awk 'FNR==1 && NR!=1{next;}{print}' /vol/scratch/data/features_files/all_feature_files/paris_splash_human_RRI_pos.csv /vol/scratch/data/features_files/all_feature_files/paris_human_RBPs_pos.csv > /vol/scratch/data/features_files/all_feature_files/full_human_pos.csv

awk 'FNR==1 && NR!=1{next;}{print}' /vol/scratch/data/features_files/all_feature_files/paris_splash_human_RRI_neg.csv /vol/scratch/data/features_files/all_feature_files/paris_human_RBPs_neg.csv > /vol/scratch/data/features_files/all_feature_files/full_human_neg.csv

full

awk 'FNR==1 && NR!=1{next;}{print}' /vol/scratch/data/features_files/all_feature_files/full_human_pos.csv /vol/scratch/data/features_files/all_feature_files/paris_mouse_RRI_pos.csv > /vol/scratch/data/features_files/all_feature_files/full_pos.csv

awk 'FNR==1 && NR!=1{next;}{print}' /vol/scratch/data/features_files/all_feature_files/full_human_neg.csv /vol/scratch/data/features_files/all_feature_files/paris_mouse_RRI_neg.csv > /vol/scratch/data/features_files/all_feature_files/full_neg.csv
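
Because the awk calls drop the first line of every later file without looking at it, a merge would still succeed if two feature files had different columns. A small check after merging can catch that. This is a sketch only: `read_rows` and `check_merge` are hypothetical helpers, and it assumes all files are non-empty CSVs with one header line.

```python
import csv


def read_rows(path):
    """Return (header, number of data rows) for a non-empty CSV file."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        return header, sum(1 for _ in reader)


def check_merge(parts, merged):
    """Raise ValueError if headers differ or the row counts do not add up."""
    merged_header, merged_rows = read_rows(merged)
    total = 0
    for path in parts:
        header, rows = read_rows(path)
        if header != merged_header:
            raise ValueError(f"header of {path} differs from {merged}")
        total += rows
    if total != merged_rows:
        raise ValueError(f"{merged} has {merged_rows} rows, expected {total}")
    return merged_rows
```

For example, `check_merge([".../full_human_pos.csv", ".../paris_mouse_RRI_pos.csv"], ".../full_pos.csv")` would return the number of data rows in `full_pos.csv` if the merge is consistent; the `...` stands for the feature file directory used in the calls above.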