Closed h-bouzid closed 1 year ago
Hi @BOUZID-Hamza,
I am not familiar with BMLrub, but I do not think this is caused by the similarity of the actions. Could you please provide some information about BMLrub, such as the number of classes, the number of videos, and the average number of frames per video?
I suppose you are using the MSRAction model. Could you please also try the NTU model?
Best
Hi @hehefan,
The dataset contains ≈3000 videos across 18 classes. Each video has almost 1000 frames (very high FPS), but I uniformly sampled 24 frames per video (the same setup as a model that I am comparing against).
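For reference, the uniform sampling described above could look roughly like the sketch below. It is only an illustration, not the actual loading code from this thread: the function name `uniform_frame_indices` is hypothetical, and it assumes each video has at least as many frames as are requested.

```python
import numpy as np


def uniform_frame_indices(num_frames: int, num_samples: int = 24) -> np.ndarray:
    """Return `num_samples` evenly spaced frame indices in [0, num_frames - 1].

    Hypothetical helper for illustration. If num_frames < num_samples,
    some indices will repeat.
    """
    if num_frames <= 0 or num_samples <= 0:
        raise ValueError("num_frames and num_samples must be positive")
    # Evenly spaced positions from the first to the last frame, rounded to integers.
    positions = np.linspace(0, num_frames - 1, num=num_samples)
    return positions.round().astype(np.int64)


# Example: pick 24 frames out of a 1000-frame sequence.
indices = uniform_frame_indices(1000, 24)
print(indices)
```

With about 1000 frames and 24 samples, consecutive selected frames are roughly 43 frames apart, so the effective frame rate seen by the model is much lower than the capture rate.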
And, yes, I am using the MSRAction model. I'll try the NTU model.
With Gratitude
Hi @BOUZID-Hamza ,
The specifics of the dataset look fine.
I think you may need to tune the "radius" and "nsamples" parameters.
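One possible way to pick a starting point for "radius" is to check how many points fall inside a ball of that radius in your own data. The sketch below is an assumption-laden illustration and is not part of the PSTNet code: `mean_neighbors_within_radius` is a hypothetical helper, and the random cloud is only a stand-in for a real BMLrub frame.

```python
import numpy as np


def mean_neighbors_within_radius(points: np.ndarray, radius: float) -> float:
    """Average number of other points within `radius` of each point.

    Hypothetical helper for illustration. `points` has shape (N, 3).
    Brute-force O(N^2), which is fine for a single frame of a few thousand points.
    """
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Count points inside the ball, excluding the point itself.
    counts = (dists <= radius).sum(axis=1) - 1
    return float(counts.mean())


# Synthetic stand-in for one frame; replace with a real point cloud frame.
rng = np.random.default_rng(0)
cloud = rng.uniform(-1.0, 1.0, size=(512, 3))

for r in (0.1, 0.3, 0.5):
    print(f"radius={r}: {mean_neighbors_within_radius(cloud, r):.1f} neighbors on average")
```

If the average count is far below "nsamples", the neighborhoods are mostly empty at that radius; if it is close to the total number of points, each neighborhood covers nearly the whole cloud. Either case suggests the radius does not match the coordinate scale of the data.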
Best
Hi @hehefan,
I had to significantly change the learning rate when training the model on other datasets.
Thank you for your help 😄.
Hi, @hehefan ,
I've read your papers "PSTNet" and "P4Transformer". Thank you for the awesome work.
I have tried training both of them on another dataset, "BMLrub", from the "AMASS" archive. I converted the data from mesh to point cloud (PC), then used the same approach to process and load the data to train both models. "P4Transformer" gives me 91-92% accuracy, but "PSTNet" gives only 35-36%.
Could it be the similarity of the action classes ('walk in circle', 'normal walk', 'treadmill slow', 'treadmill normal', 'treadmill fast', ...) that makes it difficult for the convolution-based model to differentiate between them? Or is there something wrong in the modifications I made?
If there is anything I can add to clarify the problem, please let me know. Thank you for your time.