arturjordao / WearableSensorData

This repository provides the code and data used in our paper "Human Activity Recognition Based on Wearable Sensor Data: A Standardization of the State-of-the-Art", where we implement and evaluate several state-of-the-art approaches, ranging from methods based on handcrafted features to convolutional neural networks.

Hyperparameter problem #2

Closed mylovecc2020 closed 3 years ago

mylovecc2020 commented 3 years ago

Hello, thanks for the paper and the code, from which I got a lot of inspiration and which I reimplemented in PyTorch. But when I run the code downloaded from GitHub in my own environment, the results do not match the paper. For example, when I turn down the batch size, the results are much better than with a big one, and so on. Do I have to tune the hyperparameters for every paper's code? How can I quickly reproduce the results in the paper? Thanks a lot!

arturjordao commented 3 years ago

Dear @mylovecc2020,

Do I have to tune the hyperparameters for every paper's code? No. Once we have set the hyperparameters (batch size, epochs, learning rate, optimizer, etc.), we use them on all datasets and methods. Unfortunately, due to computational constraints, we were not able to try many combinations of hyperparameters. Thus, different values for batch size, learning rate, and so on can lead to better results (as you mentioned). Finally, I would like to mention that the same convolutional architecture can present different results when running on TensorFlow/Keras or PyTorch (see this link).

How can I quickly reproduce the results in the paper? For this purpose, I recommend you use the source code in this repository, since it already contains the experimental setup (i.e., hyperparameters) used in the paper. Alternatively, you can check the hyperparameters used in our paper in this file (i.e., cm.loss, cm.bs, cm.n_ep) and use them in your PyTorch implementation.
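A minimal sketch of carrying that setup over to PyTorch could look like the following; the numeric values below are placeholders, not the paper's hyperparameters (read the real ones from the file above):

```python
import torch
import torch.nn as nn

# Illustrative placeholders -- read the real values of cm.loss, cm.bs and
# cm.n_ep from the repository's configuration file; they are not reproduced here.
loss_fn = nn.CrossEntropyLoss()      # plays the role of cm.loss
batch_size = 32                      # cm.bs (placeholder value)
n_epochs = 100                       # cm.n_ep (placeholder value)

# Any model; a trivial classifier stands in for the paper's architectures.
model = nn.Sequential(nn.Flatten(), nn.LazyLinear(64), nn.ReLU(), nn.LazyLinear(6))
optimizer = torch.optim.Adam(model.parameters())  # optimizer/lr: check the repo

def train(loader):
    model.train()
    for _ in range(n_epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
```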

If you have any other questions, please let me know.

Best regards, Artur Jordão.

Mohammadtvk commented 3 years ago

Hi @arturjordao,

I am struggling with the results and their reproducibility. For example, you mention in your paper that the mean accuracy (semi non-overlapping, leave-one-subject-out) of the ChenXue model on the WISDM dataset is 83.89%, but I got 70.58% running your code, and around 25% running my own code. The WISDM dataset has 51 subjects (link) while your data covers 36 subjects. Also, the different sensors are not synced, so we can use only one, which results in a (batch_size, channel, time, 3) shape, while yours is (batch_size, channel, time, 6).

@mylovecc2020, can I have your PyTorch implementation of these models?

Thanks a lot!

arturjordao commented 3 years ago

Hi @Mohammadtvk,

I have no idea of the reason for these poor results, because other papers have employed these implementations and achieved stable results.

Regarding the WISDM dataset, since some subjects do not perform all activities, we cannot consider all subjects in our evaluation. I did not understand the synchronization issue. Indeed, we do not synchronize the sensors ourselves, because the raw data provided by the dataset is already synchronized (i.e., all samples have the same number of columns/sensors); a quick sanity check is sketched below.
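Something along these lines should show that every sample already carries all sensor columns (the file path and array key are assumptions; adapt them to the converted data in this repository):

```python
import numpy as np

# Hypothetical path/key -- adapt to the converted data files in this repo.
data = np.load('data/LOSO/WISDM.npz', allow_pickle=True)
X = data['X']

# If the sensors were not aligned, samples would differ in their number of
# columns; here every sample shares one fixed shape (..., time, axes).
print(X.shape)
```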

If your problems persist, you can wait for other implementations or implement this benchmark yourself, as all the code is simple and easy to follow. Good luck.

Mohammadtvk commented 3 years ago

Thanks for the reply @arturjordao .

I did not change anything in your implementation and just ran it. The results I got are inconsistent, and not just on the WISDM dataset; the USCHAD dataset has the same issue. But with the handcrafted models the results are consistent. I don't know what's going on!

And about the WISDM dataset, which subjects are removed from the dataset? In the dataset that I have there are 4 files: accel_phone, accel_watch, gyro_phone and gyro_watch. Each file has a timestamp column, and the timestamps are not equal across the different sensor modalities.
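In case it clarifies what I mean by syncing: one common way to pair two modalities whose timestamps differ is a nearest-timestamp join. A minimal sketch (the column names follow the file description above; everything else is just an illustration, not your pipeline):

```python
import pandas as pd

cols = ['subject', 'activity', 'timestamp', 'x', 'y', 'z']
accel = pd.read_csv('accel_phone.txt', names=cols).sort_values('timestamp')
gyro = pd.read_csv('gyro_phone.txt', names=cols).sort_values('timestamp')

# Pair each accelerometer reading with the nearest-in-time gyroscope
# reading, yielding six channels (ax, ay, az, gx, gy, gz) per time step.
merged = pd.merge_asof(accel, gyro, on='timestamp',
                       by=['subject', 'activity'],
                       direction='nearest',
                       suffixes=('_acc', '_gyro'))
```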

arturjordao commented 3 years ago

@Mohammadtvk,

What Keras+TensorFlow version are you using?

Regarding the WISDM dataset, I believe we are using different versions. I just checked the data using this link and there have been some modifications, e.g., when I converted the files there were no "phone" and "watch" files. In addition, those files were created around July 2019, whereas I converted the WISDM files around 2017 (for the first version of the paper). Unfortunately, I no longer have the raw data I used. I will check whether the other authors have these files and answer you as soon as possible.

Best regards, Artur Jordão.

arturjordao commented 3 years ago

@Mohammadtvk,

I found the raw data I used at this link. Also, I have attached the converted raw data, where you can check the number of subjects and other information: converted_data.zip
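To inspect the attachment, something like the following could work (the archive layout and the position of the subject id are assumptions; adjust to the actual files):

```python
import io
import zipfile
import numpy as np

with zipfile.ZipFile('converted_data.zip') as zf:
    print(zf.namelist())  # see which converted files are included
    # Hypothetical layout: a whitespace-separated text file per dataset,
    # with the subject id in the first column.
    raw = np.loadtxt(io.BytesIO(zf.read(zf.namelist()[0])))
    print('subjects:', np.unique(raw[:, 0]).size)
```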