iLampard / lamp

PyTorch implementation of the paper "Language Models Can Improve Event Prediction by Few-Shot Abductive Reasoning", NeurIPS 2023
Apache License 2.0

Why overlap=False in trainset and True in valid and test set? #4

Closed otakusbear closed 6 months ago

otakusbear commented 6 months ago

In kg_dataset_factory.py, line 35:

    self.train_dataset = KGDataset(
        data=self.data,
        context_length=self.context_length,
        time_factor=self.time_factor,
        end_ratio=self.train_end_index_ratio,
        overlap=False
    )

Why is overlap=False for the train set but True for the valid and test sets? On the GDELT dataset, this gives 854 for the train set size and 10684 for the valid and test size. I haven't been in this field for long, so I don't quite understand the reason for doing this. It seems a bit out of line with the conventions of other tasks. Could you explain it? Thanks very much.

iLampard commented 6 months ago

Hi, overlap=False means the sliding windows do not overlap in time. It is used simply to speed up training by using less data.
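
To illustrate why the two settings produce such different sample counts, here is a minimal sketch of overlapping versus non-overlapping sliding windows. It is not the repository's KGDataset code: the helper `sliding_windows`, the stride choices (1 when overlapping, `context_length` otherwise), and the toy event list are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def sliding_windows(events: Sequence, context_length: int, overlap: bool) -> List[Tuple]:
    """Hypothetical helper: cut a time-ordered sequence into fixed-length windows.

    Assumption: overlap=True advances the window by one step, while
    overlap=False advances it by a full window, so no step is shared.
    """
    if context_length <= 0:
        raise ValueError("context_length must be positive")
    stride = 1 if overlap else context_length
    last_start = len(events) - context_length
    # range() is empty when the sequence is shorter than one window.
    return [
        tuple(events[start:start + context_length])
        for start in range(0, last_start + 1, stride)
    ]


# Toy data: ten time steps, windows of length three.
events = list(range(10))
overlapping = sliding_windows(events, context_length=3, overlap=True)
disjoint = sliding_windows(events, context_length=3, overlap=False)

print(len(overlapping), len(disjoint))
```

Under these assumptions, a sequence of n steps yields n - context_length + 1 overlapping windows but only about n / context_length disjoint ones, which is the kind of gap described in the question.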