Status: Closed (closed by WeianMao 8 months ago)
We don't filter by time because there is no notion of train/valid/test in unsupervised learning. The objective is to learn the data distribution. You're right that PDB is constantly being updated, so one should re-download from time to time.
As we all know, the data in PDB is continuously being updated. However, I noticed that your data processing scripts do not filter the samples by a time threshold. May I ask how you align the samples used in your paper? As time goes by, the number of samples I use will keep growing.
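
If a fixed snapshot is needed for comparison, one option is to apply a release-date cutoff after downloading. The following is only a minimal sketch, not part of the repository's scripts: it assumes each entry carries an ISO-format `release_date` field, and the helper name `filter_by_release_date`, the field names, and the placeholder IDs are all hypothetical.

```python
from datetime import date


def filter_by_release_date(entries, cutoff):
    """Keep only entries released on or before the cutoff date.

    `entries` is assumed to be a list of dicts with an ISO-format
    'release_date' string (hypothetical layout, for illustration only).
    """
    return [
        e for e in entries
        if date.fromisoformat(e["release_date"]) <= cutoff
    ]


# Placeholder metadata; the IDs and dates are made up for illustration.
entries = [
    {"pdb_id": "AAAA", "release_date": "2019-05-01"},
    {"pdb_id": "BBBB", "release_date": "2021-08-15"},
    {"pdb_id": "CCCC", "release_date": "2023-01-10"},
]

# Freeze the dataset at a chosen cutoff so later downloads give the same set.
snapshot = filter_by_release_date(entries, date(2021, 12, 31))
print([e["pdb_id"] for e in snapshot])
```

Recording the cutoff date alongside any results would let others rebuild the same sample set from a later download.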