UC-Irvine-CS175 / final-project-e-mcheese-2

final-project-e-mcheese-2 created by GitHub Classroom
0 stars 1 forks source link

Review Assignment Due Date

E=mCheese2


ACQUIRING DATA

In src/main.py call initial_download().
This function creates a new directory with path data/processed. It downloads all the data from AWS using boto3.

NOTE: ONLY CALL THIS FUNCTION ONCE!


APPLYING WATERSHED

In src/main.py call generate_watershed().
This function generates all of the watershed masks for a given subset of images.

Parameters:
string csv_dir: The directory that contains the target csv file. This has to include the full path to the directory.
string csv_file
: The name of the csv file contained in csv_dir with the target images.
bool file_on_prem: If true will pull source images from local disk. If false will pull images from AWS.
__string save_local__: Default is none, which will not save the images but instead will show temporary matplot images. If given a string, it should be the directory where the images will be saved to.


VERIFYING WATERSHED

To produce plots of foci counts and estimated Poisson distributions, run the main() function of src/poisson_distributions.py. In order to run this main, the following steps must be completed:

Use prepare_subset_data() to create CSV files for each subset. Parameters:
string meta_csv_path: The path to the meta csv file containing the entire dataset.
string subset_csv_dir: The path to the directory where the user wants to save the subset csv files.

Use pickle_masks() on each subset CSV to create an array of masks for each image in the CSV and save that array as a pickle file locally. Parameters:
string local_csv_path: The path to the subset csv file.
string local_data_dir: The path to the directory where raw images are stored.
string local_masks_path: The path to where the user wants the mask pickle to be saved.
Watershed watershed: The path to the directory where the user wants to save the subset csv files.
torchvision.transform transform: Transformation to apply to images before masking. Default None.

In the poisson_distributions.py main(), change the directory passed to visualize_subset() to the local mask directory (whatever was passed as subset_csv_dir to prepare_subset_data()).


TRAINING THE AUTOENCODER

In src/main.py or src/models/unsupervised/autoencoder_v1.py call train_autoencoder().
This function trains the unsupervised autoencoder model.

Parameters:
string csv_dir: The directory that contains the target csv file. This has to include the full path to the directory.
string train_file
: The name of the csv file contained in csv_dir with the images to train the model with
string val_file__: The name of the csv file contained in csv_dir with the images to validate the model with
int num_workers__: Specifies how many workers will be used for the dataloaders

NOTE: All files to be used for training/validation must be stored locally in csv_dir


USING THE AUTOENCODER

In src/main.py call generate_autoencoder_output().
This function loads a pretrained autoencoder model, and generates its output images for a given subset.

Parameters:
model_weights: Specifies the path to the trained model weights.
string csv_dir
: The directory that contains the target csv file. This has to include the full path to the directory.
string csv_file__: The name of the csv file contained in csv_dir with the target images.
string save_local__: Default is none, which will not save the images but instead will show temporary matplot images. If given a string, it should be the directory where the images will be saved to.

NOTE: There are pretrained model weights in the directory named Model_Weights


t-SNE Analysis

In src/main.py call generate_tsne_plots().
This function saves the t-SNE plots for the images in a given directory and subset.

Parameters:
string csv_dir: The directory that contains the target csv file. This has to include the full path to the directory.
string csv_file
: The name of the csv file contained in csv_dir with the target images.
__string file_name__: This specifies the titles and file names of the t-SNE plots

NOTE: Make sure the files you want to evaluate are in csv_dir, as only the files in this directory will be found.


Comparing All Model Images

In src/main.py call plot_all_images().
This function shows a side by side comparison of the raw images, watershed masks, and autoencoder output.

Parameters:
string process_dir: The directory that contains the target csv file, along with all the raw images.
string watershed_dir
: The directory that contains the watershed output images.
string autoencoder_dir: The directory that contains the autoencoder output images.
string save_local
: Default is none, which will not save the images but instead will show temporary matplot images. If given a string, it should be the directory where the images will be saved to.

NOTE: All of the directory parameters must include the full path to each respective directory.


Tips on Saving Output Images

Make sure to save all raw, Watershed, and Autoencoder Outputs into separate directories