Dv04 / Vision_Transformer

A project based on Vision Transformer for Image Regression
MIT License

Implement Robust Dataset Splitting Mechanism #2

Closed Dv04 closed 9 months ago

Dv04 commented 9 months ago

The project requires a robust mechanism for splitting the dataset into training, validation, and testing sets. The following points should be addressed:

Balanced Distribution: Ensure a balanced distribution of data among these sets to prevent any bias during model training and evaluation.
Stratification: If applicable, stratify the split to maintain the distribution of certain variables.
Randomization: Include randomization to ensure different data points are used across multiple runs, enhancing the model's robustness.
Configurability: Allow for easy configurability of the split ratios and seed for reproducibility.

This mechanism should be encapsulated in a function, making the dataset preparation phase clean and reproducible.
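As an illustration of the four requirements above, here is a minimal sketch using only the Python standard library. It is not the implementation merged for this issue: the function name `split_dataset`, its parameters, and the default ratios are all assumptions made for the example. For a regression target, the `strata` keys could be binned target values, which is also an assumption.

```python
import random
from collections import defaultdict


def split_dataset(items, ratios=(0.7, 0.15, 0.15), seed=42, strata=None):
    """Split items into (train, val, test) lists.

    items  : sequence of samples (e.g. image paths)
    ratios : train/val/test fractions, must sum to 1
    seed   : fixes the shuffle so the split is reproducible
    strata : optional sequence of group keys, one per item; when given,
             each group is split separately so its share is preserved
    """
    if len(ratios) != 3 or abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must be three fractions that sum to 1")
    if strata is not None and len(strata) != len(items):
        raise ValueError("strata must have one key per item")

    rng = random.Random(seed)  # local generator, does not touch global state
    groups = defaultdict(list)
    for i, item in enumerate(items):
        groups[strata[i] if strata is not None else None].append(item)

    train, val, test = [], [], []
    for key in sorted(groups, key=repr):  # fixed group order for reproducibility
        group = groups[key]
        rng.shuffle(group)
        n_train = int(round(ratios[0] * len(group)))
        n_val = int(round(ratios[1] * len(group)))
        train += group[:n_train]
        val += group[n_train:n_train + n_val]
        test += group[n_train + n_val:]  # remainder, so no item is dropped
    return train, val, test
```

Calling `split_dataset(paths, seed=0)` twice returns the same split, while changing `seed` gives a different shuffle. Very small strata may leave the validation or test part empty for that group.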

darshbaxi commented 9 months ago

@Dv04 can you assign me this issue :)

Dv04 commented 9 months ago

Hey @darshbaxi, glad to have you with us. Assigning you now.

darshbaxi commented 9 months ago

Is data to be passed as a parameter? What exactly is the job of the folders variable?

Dv04 commented 9 months ago

> is data to be passed as a parameter? what exactly is the job of folders variable ??

Data has to be passed in the folders variable as follows:

E.g.

import glob
folders = glob.glob("data/*")

which will assign a list like this to folders:

['data/Image_Folder1', 'data/Image_Folder2', 'data/Image_Folder3', 'data/Image_Folder4', 'data/Image_Folder5', 'data/Image_Folder6']

This is only needed if you have more than one folder in the dataset. Otherwise, you can directly give the folders variable the path to the main image directory.
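To make the two cases concrete, here is a hedged sketch of a helper that accepts either a list of folders or a single directory path and returns the image files inside. The name `collect_image_paths` and the extension list are hypothetical and not taken from the repository.

```python
import glob
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")  # assumed; adjust to the dataset


def collect_image_paths(folders):
    """Return a sorted list of image paths from one folder or a list of folders."""
    if isinstance(folders, str):
        folders = [folders]  # single main image directory
    paths = []
    for folder in folders:
        # look only at the direct children of each folder
        for path in glob.glob(os.path.join(folder, "*")):
            if path.lower().endswith(IMAGE_EXTENSIONS):
                paths.append(path)
    return sorted(paths)
```

With this, `collect_image_paths(glob.glob("data/*"))` covers the multi-folder case and `collect_image_paths("data/Image_Folder1")` covers the single-directory case.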

Hope that clears things up. Ask away if anything else comes up.

darshbaxi commented 9 months ago

@Dv04 I have created a Pull Request. Please have a look at it.

Dv04 commented 9 months ago

> @Dv04 I have created a Pull Request. Please have a look at it

@darshbaxi, I will get back to you by tomorrow; I just have to finish up some bits of work.