
# Project IFT6268 - Exploration Path on SimCLRv2

An exploration path on SimCLRv2, from the paper "Big Self-Supervised Models are Strong Semi-Supervised Learners".

*An illustration of SimCLR (from the Google AI blog).*

[Original Google SimCLR Git Repo](https://github.com/google-research/simclr)
## Project description

The project looks at SimCLRv2 in a low-data and low-compute regime, as well as how well it generalizes to other datasets.

### Methodology

....

## Environment setup

Conda environment:

- `conda create --name simclr python=3.7`
- `pip install -r requirement.txt`

Download XRAY (2.5 GB):

Download the Google models:

- Install the Google Cloud SDK: https://cloud.google.com/sdk/docs/downloads-versioned-archives
- `gsutil cp -r 'gs://simclr-checkpoints/simclrv2/finetuned_100pct/r50_1x_sk0/hub/' .`

If there are import errors, make sure the checkpoint files are named correctly: rename `saved_model.pb` to `tfhub_model.pb`, and do the same for the `variables` files.

Both the Graham and Cedar clusters are used in the project.

## Pre-Training on XRAY

Pre-training is done with `run.py`. Locally, `launch_template.json` provides a parameter template that can be used with VSCode; a launch sketch is given below. On a compute node, the script is launched with *sbatch run.sh username*.

Every run generates a monolithic output, archived and named using the datetime, containing:

- one or several checkpoints
- a final HUB file
- the FLAGS (active arguments) in a text and a pickle file
- TensorBoard files
- a run log in human-readable format (`*.txt`)

>In: XRAY dataset
>Out: XRAY pre-train monolithic output
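As a rough sketch, a local pre-training launch could look like the following. The flag names are assumptions carried over from the upstream Google SimCLR `run.py` and may differ in this fork; the parameter set actually used by the project lives in `launch_template.json`.

```bash
# Hypothetical local pre-training launch -- flag names follow the upstream
# Google SimCLR run.py and may differ in this fork; see launch_template.json
# for the parameters actually used.
python run.py \
  --train_mode=pretrain \
  --train_batch_size=64 \
  --train_epochs=100 \
  --dataset=xray \
  --model_dir=./runs/pretrain_xray

# On Compute Canada (Graham or Cedar), submit to a compute node instead:
sbatch run.sh username
```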
Helper scripts:

- *scripts/down_pretrain_models.sh username*: downloads pretrained models locally
- *scripts/sync_scratch.sh*: used on Compute Canada to sync home with scratch
- *scripts/initial_down_whl.sh*: downloads the wheel (`.whl`) files for packages not available on Compute Canada

## Fine-tuning and validation

Run `Finetuning/finetuning.py`, using `config.yml` as a template; a minimal invocation sketch follows at the end of this section.

>In: XRAY pre-train monolithic output or a Google pretrained model
>Out: monolithic output

Every run generates a monolithic output, archived and named using the datetime, containing:

- one or several checkpoints
- a final HUB file
- MLflow information merged into Finetuning/mlruns
- TensorBoard files
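A minimal sketch of the fine-tuning invocation, assuming `finetuning.py` accepts the path to the configuration file as an argument (the `--config` argument name is an assumption; the fields it expects are documented in `config.yml` itself):

```bash
# Hypothetical invocation -- the --config argument name is an assumption.
# Copy config.yml, point it at an XRAY pre-trained model or a downloaded
# Google checkpoint, then run:
python Finetuning/finetuning.py --config my_config.yml
```

Authors: Shannel Gauthier, Marc-Andre Ruel, Yan Cote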