ainazHjm / Baselines

This repo contains the baseline models for image data (hdf5 format) and a simple linear model for sea2sky data (csv format)

Baselines

This repo contains baseline models for predicting landslides: models for image data in hdf5 format and a simple linear model for sea2sky data in csv format.

Installation

Assuming python3 and pip3 are already installed, use the following command to install the required packages:

pip3 install -r requirements.txt

Compile and Run

Clone the sea2sky branch into a folder. The code needs access to the data folder, so don't put it in a write-protected place. Get the corresponding data, which contains the csv files and the previously created dataset (.h5), and put it in a folder separate from the code.

To train a new model, use this command in the code's directory with the appropriate arguments:

python3 -m main [args]

If no arguments are specified, the default values are used. The following command is an example:

python3 -m main --data_path '/home/ainaz/Projects/Landslides/Baselines/code/sea2sky.h5' --threshold 0.9 --model LinearLayer --weight 5 --lr 0.01 --n_epochs 5 --batch_size 50 --num_workers 4 --sea2sky True --feature_num 136 --save_model_to '../models/Linear_threshold90_weighted/' --decay '0.001' --s 1

Here are the possible arguments with their use cases:
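
The flags used in the example command could be declared with argparse roughly as below. This is only a sketch: the types and help strings are assumptions inferred from the example values, defaults are deliberately left unset, and the str2bool helper is hypothetical (it is there because argparse's type=bool would treat the string 'False' as true). The real definitions live in main and may differ.

```python
import argparse


def str2bool(value):
    """Hypothetical helper: turn 'True'/'False' style strings into booleans."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    # Types and help texts are guesses based on the example command,
    # not copied from the repo. No defaults are set here on purpose.
    parser = argparse.ArgumentParser(prog="main")
    parser.add_argument("--data_path", type=str, help="path to the .h5 dataset")
    parser.add_argument("--threshold", type=float, help="decision threshold")
    parser.add_argument("--model", type=str, help="model name, e.g. LinearLayer")
    parser.add_argument("--weight", type=float, help="class weight")
    parser.add_argument("--lr", type=float, help="learning rate")
    parser.add_argument("--n_epochs", type=int, help="number of training epochs")
    parser.add_argument("--batch_size", type=int, help="batch size")
    parser.add_argument("--num_workers", type=int, help="data loader workers")
    parser.add_argument("--sea2sky", type=str2bool, help="use the sea2sky data")
    parser.add_argument("--feature_num", type=int, help="number of input features")
    parser.add_argument("--save_model_to", type=str, help="output folder for the model")
    parser.add_argument("--decay", type=float, help="decay value (meaning assumed)")
    parser.add_argument("--s", type=int, help="meaning not documented here")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

With a parser like this, a quoted value such as --decay '0.001' still arrives as a float, because the shell strips the quotes before argparse sees the string.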

load_csv.py provides helper functions for cleaning the data and merging multiple csv files (the outputs from the matcher). It combines the csv files into a single data table containing all relevant features with their target ids, and this table is then used to create the dataset (hdf5). There is also a function that replaces no-data values with the most common value of the corresponding feature; it can be enabled when creating the dataset by setting imputation=True.
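
To make the merge and imputation steps concrete, here is a minimal standard-library sketch. It is not the code in load_csv.py: the function names, the column names, and the "-9999" no-data marker are all assumptions chosen for illustration, and "most common" is interpreted as the most frequent valid value within each column.

```python
import csv
import io
from collections import Counter

NO_DATA = "-9999"  # assumed no-data marker; the real marker may differ


def merge_csv_files(files):
    """Concatenate the rows of several open CSV files into one list of dicts."""
    rows = []
    for f in files:
        rows.extend(csv.DictReader(f))
    return rows


def impute_most_common(rows, columns, no_data=NO_DATA):
    """Replace no-data entries with the most frequent valid value of each column."""
    for col in columns:
        valid = [row[col] for row in rows if row[col] != no_data]
        if not valid:
            continue  # the column has no valid values to impute from
        most_common_value = Counter(valid).most_common(1)[0][0]
        for row in rows:
            if row[col] == no_data:
                row[col] = most_common_value
    return rows


# Two small in-memory files stand in for matcher outputs (made-up values).
part_a = io.StringIO("target_id,slope,rock_type\n1,12,granite\n2,-9999,granite\n")
part_b = io.StringIO("target_id,slope,rock_type\n3,12,-9999\n4,30,basalt\n")

table = merge_csv_files([part_a, part_b])
table = impute_most_common(table, ["slope", "rock_type"])
for row in table:
    print(row)
```

Values are kept as strings here to keep the sketch short; a real pipeline would convert numeric features before writing the hdf5 dataset.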