FangyunWei / SLRT

The provided CSL-Daily text is different from the original dataset. #33

Closed BinDaZi closed 9 months ago

BinDaZi commented 10 months ago

Hello. First, I'd like to express my gratitude for your contribution. While comparing the original CSL-Daily dataset with the data you provided under 'SLRT/TwoStreamNetwork/data/csl-daily', I noticed some discrepancies in the sentences. For instance, for the sample S000794_P0000_T00, the label you provided is: 'text': 冰箱里有饮料面包。 However, in the original dataset, it is: 冰箱里有饮料面包。 Could you please clarify the reason for this modification? Thank you for your time and assistance.

ChenYutongTHU commented 10 months ago

Hi, thanks for raising this issue. We didn't modify any of the data that we received from the dataset creators. I guess there may have been an update in their release. The discrepancies shouldn't affect performance much, at least in the example you provided. Please let me know if you find any differences that might be non-trivial.
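
For anyone who wants to check which differences are non-trivial: the two sentences quoted above look identical as rendered in this thread, so any difference may lie in whitespace or character width rather than in the wording. Below is a minimal sketch of one way to separate the two cases. It assumes both label sets have already been loaded into plain `{sample_id: sentence}` dicts. The loading step depends on each release's file format and is not shown. The names `diff_labels` and `describe_chars` are hypothetical helpers, not part of the SLRT code or the CSL-Daily release.

```python
import unicodedata


def describe_chars(text):
    """List (char, code point, Unicode name) for every character in text."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]


def diff_labels(provided, original):
    """Compare two {sample_id: sentence} dicts (an assumed format).

    Returns a dict of the shared ids whose sentences differ. Each entry
    records both versions and whether they become equal after NFKC
    normalization and removal of all whitespace.
    """
    mismatches = {}
    for sample_id in sorted(provided.keys() & original.keys()):
        a, b = provided[sample_id], original[sample_id]
        if a == b:
            continue
        # Fold full-width/half-width variants, then drop all whitespace.
        norm_a = "".join(unicodedata.normalize("NFKC", a).split())
        norm_b = "".join(unicodedata.normalize("NFKC", b).split())
        mismatches[sample_id] = {
            "provided": a,
            "original": b,
            "only_formatting": norm_a == norm_b,
        }
    return mismatches
```

Entries with `only_formatting` set to `False` are the ones worth reporting. For entries where it is `True`, calling `describe_chars` on both versions shows exactly which code points differ.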