Question regards to validation

piEsposito / blitz-bayesian-deep-learning

A simple and extensible library to create Bayesian Neural Network layers on PyTorch.

GNU General Public License v3.0


Question regards to validation #75

Closed Erickurashi closed 3 years ago

Erickurashi commented 3 years ago

Hi Piero, thank you very much for your post on the Bayesian LSTM. Is there any validation in this code? Thank you.

TerenceLiu98 commented 3 years ago

Do you mean the [training, validation, testing] structure? I did not find a testing step either. You can split the dataset again and then follow the standard procedure.

Erickurashi commented 3 years ago

> Do you mean the [training, validation, testing] structure? I did not find the testing either. Just split the dataset again and you can follow the standard procedure.

Hi Terence, thank you for your answer. Do you know what type of validation this code uses? Is it walk-forward validation? Sorry, I am new to coding: how do I split the dataset again and follow the standard procedure? Thank you.
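For readers unfamiliar with the term, walk-forward validation trains on all data up to some point in time and evaluates on the block that follows, then moves the cut-off forward and repeats. The thread does not establish which scheme the original code uses, so the sketch below only illustrates the idea. The helper `walk_forward_splits` and its parameters are hypothetical names chosen for this example and are not part of BLiTZ.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_samples: int, n_folds: int, test_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs for expanding-window walk-forward validation.

    Hypothetical helper for illustration only; it is not part of BLiTZ.
    """
    first_test_start = n_samples - n_folds * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested folds")
    for fold in range(n_folds):
        test_start = first_test_start + fold * test_size
        # Train on everything that comes before the test block.
        train_idx = list(range(0, test_start))
        # Evaluate on the next block in time.
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx


splits = list(walk_forward_splits(n_samples=10, n_folds=3, test_size=2))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

In every fold the training indices all precede the test indices, which is the property that distinguishes this scheme from a shuffled split.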

TerenceLiu98 commented 3 years ago

@Erickurashi Hi, since you are new to coding, you could start by reading the scikit-learn documentation (https://scikit-learn.org/). Many people use this package for preprocessing; for example, to split a dataset you can simply use its train_test_split() function.
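A minimal sketch of the suggestion above, calling `train_test_split()` twice to obtain training, validation, and test sets. The toy arrays and the 60/20/20 proportions are assumptions made for illustration. `shuffle=False` is used here on the assumption that the data is a time series whose order should be preserved; the function shuffles by default.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data standing in for the real features and targets (assumed shapes).
X = np.arange(100).reshape(50, 2)
y = np.arange(50)

# First hold out the last 20% as the test set; shuffle=False keeps time order.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

# Split the remainder again: 25% of the remaining 80% is 20% of all data.
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, shuffle=False
)

print(len(X_train), len(X_val), len(X_test))
```

The validation set can then be used for model selection and early stopping, leaving the test set untouched until the final evaluation.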

Erickurashi commented 3 years ago

[Image: Screenshot 2021-03-12 142128]

@TerenceLiu98 Thank you for the link. I am confused about why y_train data appears in the X_test data in the original code (as shown in the image). Shouldn't the data used for training and testing be completely separate? Thank you.

TerenceLiu98 commented 3 years ago

X is the feature and y is the response variable; you need to figure out what you want to use as input.

Erickurashi commented 3 years ago

> X is the feature and y is the response variable, you need to figure out what you want to input.

I understand that X is the feature and y is the response. X_test is used to predict y_test and X_train is used to predict y_train, but I don't understand why y_train data appears in X_test. The test dataset should be a new set of data.

TerenceLiu98 commented 3 years ago

> > X is the feature and y is the response variable, you need to figure out what you want to input.
>
> I understand X is the feature, y is the response. X_test data is used to predict y_test, X_train data is used to predict y_train, but I don't understand why is y_train data has X_test data? Test dataset should be new set of data.

Usually the training and test data should be different. Maybe you need to check your code and the shapes of your data.

Erickurashi commented 3 years ago

> > > X is the feature and y is the response variable, you need to figure out what you want to input.
> >
> > I understand X is the feature, y is the response. X_test data is used to predict y_test, X_train data is used to predict y_train, but I don't understand why is y_train data has X_test data? Test dataset should be new set of data.
>
> Usually testing train should be different, maybe you need to check your code and the data shape

Hi, I have also run the author's original code. If you display X_test and y_train, some of the same data appears in both sets.
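The original code is not reproduced in this thread, so the following is only one possible explanation, not a diagnosis. It assumes that each input is a sliding window over a single series and that the target is the value right after the window. Under that assumption, a shuffled split places test windows between training samples, so values inside test inputs also occur as training targets. The toy series, the window length, and the helper `shared_values` are all hypothetical.

```python
import random

series = list(range(20))  # toy series with unique values so overlap is easy to see
window = 3                # assumed window length

# Each sample: `window` consecutive values as input, the next value as target.
X = [series[i : i + window] for i in range(len(series) - window)]
y = [series[i + window] for i in range(len(series) - window)]


def shared_values(X_test, y_train):
    """Return the values that occur both in test inputs and in training targets."""
    test_values = {v for row in X_test for v in row}
    return sorted(test_values & set(y_train))


# Shuffled split: test windows are interleaved with training samples.
idx = list(range(len(X)))
random.Random(0).shuffle(idx)
cut = int(0.75 * len(idx))
rand_train_idx, rand_test_idx = idx[:cut], idx[cut:]
random_overlap = shared_values(
    [X[i] for i in rand_test_idx], [y[i] for i in rand_train_idx]
)

# Chronological split with a gap of `window` samples between train and test.
chron_cut = 10
chron_train_idx = list(range(chron_cut))
chron_test_idx = list(range(chron_cut + window, len(X)))
gap_overlap = shared_values(
    [X[i] for i in chron_test_idx], [y[i] for i in chron_train_idx]
)

print("shuffled split shares:", random_overlap)
print("chronological split with gap shares:", gap_overlap)
```

Some sharing between training targets and later inputs is normal in forecasting, because past observations are exactly what the model is given. Whether the overlap matters depends on how the split was made and what the evaluation is meant to show.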

piEsposito commented 3 years ago

Closing due to staleness.