IBM / fold2seq

Code for Fold2Seq paper from ICML 2021
Apache License 2.0

Regarding Data Source and Data Structure #7

Open chq1155 opened 2 years ago

chq1155 commented 2 years ago

Hi, I am facing a data source problem. I would like to apply my own data to your amazing model, but I cannot work out how to build a data structure that fits your model's expected input.

Could you describe the expected data structure, or share the file "../data/domain_dict_full.pkl" so that I can figure this out?

By the way, there is a bug in fold_feat_gen.py at lines 42 and 48: the variables 'start' and 'end' should be strings so that they can be passed to the function 'replace'. Similarly, at lines 83, 87, 89, and 91 of the same file, values in the nested dictionary cannot be extracted by simply indexing as ss['seq'].
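
As a minimal sketch of the first point (this is not the repository's code; `remove_token` and the sample values are hypothetical), `str.replace` raises a `TypeError` when it is given integers, so casting with `str()` first avoids the error:

```python
def remove_token(text: str, token) -> str:
    """Remove every occurrence of token from text, casting token to str first."""
    return text.replace(str(token), "")


line = "residues 12 to 87"  # hypothetical input, not taken from the dataset
start, end = 12, 87         # integers, as assumed for 'start' and 'end'

try:
    line.replace(start, "")  # str.replace rejects non-string arguments
    failed = False
except TypeError:
    failed = True

# With the cast in place, the same call succeeds.
cleaned = remove_token(remove_token(line, start), end)
```

Here `failed` ends up `True` for the uncast call, while the cast version returns normally.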

Looking forward to your reply. Many thanks!

raiyan3 commented 2 years ago

Hi, if you've gone over the datasets, you'll find there are four sets of data: train, val, id, and od. Here, I am sharing with you the domain_data.pkl file constructed with the OD dataset: https://drive.google.com/file/d/12vRaK6JevY7Rt4MdB2DUKb1JPmh-bFuv/view?usp=sharing

Hope this helps.
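
For anyone preparing their own data, one way to see the layout of a pickle file like this is to load it and summarize its nested keys. The sketch below is only an illustration under stated assumptions: `summarize` is a hypothetical helper, and the stand-in dictionary merely mimics a nested layout with an 'ss' entry containing 'seq'. The real keys in the shared file may differ.

```python
import pickle
from collections.abc import Mapping


def summarize(obj, max_depth=4, max_keys=5, depth=0):
    """Describe nested dictionaries by key, replacing leaf values with type names."""
    if isinstance(obj, Mapping) and depth < max_depth:
        return {
            key: summarize(value, max_depth, max_keys, depth + 1)
            for key, value in list(obj.items())[:max_keys]
        }
    return type(obj).__name__


# Stand-in data (hypothetical keys and values), round-tripped through pickle
# in memory so the example runs without the real file.
stand_in = {"domainA": {"ss": {"seq": "HHEE"}, "length": 4}}
data = pickle.loads(pickle.dumps(stand_in))

summary = summarize(data)

# For a real file, load it instead (only unpickle files from a trusted source):
#     with open("path/to/file.pkl", "rb") as f:
#         data = pickle.load(f)
```

Printing `summary` for the downloaded file should show which keys each entry carries and how deeply they are nested, which is the information needed to build a matching structure from new data.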