znhy1024 / HEARD

FileNotFoundError: [Errno 2] No such file or directory: 'data/BEARD.pkl' #1

Open drob-xx opened 2 years ago

drob-xx commented 2 years ago

I get the error in the title when I run main(). The config file sets 'data/BEARD.pkl' as the data path, but that file is not present. The data ZIP contains some JSON files. Please advise.

pid: 323
*****2022-10-28 21:06:36*****
{'active_model': 'HEARD', 'models': {'HEARD': {'early_stop_lr': 1e-05, 'early_stop_patience': 6, 'hyperparameters': {'learning_rate': {'RD': 0.0002, 'HC': 0.0002}, 'max_seq_len': 100, 'max_post_len': 300, 'batch_size': 16, 'epochs': 12, 'lstm_dropout': 0.1, 'fc_dropout': 0.3, 'beta': {'HC': 1.0, 'T': 1.0, 'N': 1.0}, 'hidden_size_HC': 64, 'hidden_size_RD': 128, 'in_feats_HC': 1, 'in_feats_RD': 1000, 'sample_integral': 100, 'sample_pred': 100, 'weight_decay': 0.0001, 'interval': 3600.0, 'decay_patience': 3, 'lstm_layers': 1}, 'evaluate_only': False, 'data': 'data/BEARD.pkl', 'data_ids': 'data/BEARD_ids.pkl', 'device': 'cuda', 'dataset': 'BEARD', 'model_dir': 'saved_models/'}}}
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-11-577c39408d20> in <module>
----> 1 execfile('./Main.py')

5 frames
/usr/local/lib/python3.7/dist-packages/debugpy/_vendored/pydevd/_pydev_imps/_pydev_execfile.py in execfile(file, glob, loc)
     23 
     24     #execute the script (note: it's important to compile first to have the filename set in debug mode)
---> 25     exec(compile(contents+"\n", file, 'exec'), glob, loc)

/content/HEARD/Main.py in <module>
     40 
     41 if __name__ == '__main__':
---> 42     main()
     43     print(f'[+]Done: '+time.strftime("%Y-%m-%d %H:%M:%S",time.localtime()))

/content/HEARD/Main.py in main()
     17 
     18     results = {}
---> 19     handle = Train(Config)
     20     val_loader = handle.val_loader
     21 

/content/HEARD/Train.py in __init__(self, config)
     66         self.interval = config["models"][config["active_model"]]["hyperparameters"]["interval"]
     67 
---> 68         self.val_len,self.val_loader,self.folds_loader = get_dataloader(config)
     69 
     70 

/content/HEARD/Dataset.py in get_dataloader(config)
    161 def get_dataloader(config):
    162 
--> 163     handle = HDataLoader(config)
    164     val_len, val_loader,folds_loader = handle.get_loaders()
    165     return val_len, val_loader,folds_loader

/content/HEARD/Dataset.py in __init__(self, config)
     81         self.text_feats = config["models"][config["active_model"]]["hyperparameters"]["in_feats_RD"]
     82 
---> 83         self.data = pickle.load(open(config["models"][config["active_model"]]["data"],'rb'))
     84 
     85         self.dataids = pickle.load(open(config["models"][config["active_model"]]["data_ids"],'rb'))

FileNotFoundError: [Errno 2] No such file or directory: 'data/BEARD.pkl'
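
The traceback shows the failure at the `pickle.load(open(...))` call in Dataset.py, which opens whatever path the active model's `data` key holds. A minimal sketch of a pre-flight check that could run before `Train(Config)` is below. The helper name `missing_data_files` is hypothetical and not part of the HEARD repository; only the `active_model`, `models`, `data`, and `data_ids` keys are taken from the printed config.

```python
import os


def missing_data_files(config):
    """Return the configured data paths that do not exist on disk.

    Hypothetical helper, not part of HEARD. It only reads the keys that
    appear in the config printed above.
    """
    model_cfg = config["models"][config["active_model"]]
    paths = [model_cfg["data"], model_cfg["data_ids"]]
    return [p for p in paths if not os.path.isfile(p)]


if __name__ == "__main__":
    # Trimmed copy of the printed config, keeping only the relevant keys.
    config = {
        "active_model": "HEARD",
        "models": {
            "HEARD": {
                "data": "data/BEARD.pkl",
                "data_ids": "data/BEARD_ids.pkl",
            }
        },
    }
    missing = missing_data_files(config)
    if missing:
        print("Missing data files:", ", ".join(missing))
```

Running a check like this first reports every missing file at once, rather than stopping at the first `open()` that fails.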

znhy1024 commented 2 years ago

Hi,

The pkl file should contain the content of the tweets in the BEARD dataset. However, we cannot release the tweet content itself because of the terms of use for Twitter data. Instead, you can construct the pkl file by following the instructions in the BEARD Dataset section of the README:

  1. Download the tweet content via the Twitter API, using the tweet ids in the JSON file.
  2. Prepare the input data for the HEARD model and save it as a pkl file.
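
The two steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's preprocessing code: the JSON layout (`{event_id: [tweet_id, ...]}`), the structure written to the pickle, and the names `build_pickle` and `fetch_text` are all hypothetical. The actual input format expected by HEARD is described in the README, and `fetch_text` stands in for whatever Twitter API client you use.

```python
import json
import pickle


def build_pickle(ids_json_path, out_path, fetch_text):
    """Read tweet ids from a JSON file, fetch each tweet's text, pickle the result.

    ASSUMPTIONS (not confirmed by the HEARD repository):
      * the JSON maps an event id to a list of tweet ids;
      * fetch_text(tweet_id) is a caller-supplied function that returns the
        tweet text, or None when the tweet is no longer available;
      * the output is a dict of event id -> list of {"id", "text"} records.
    """
    with open(ids_json_path, encoding="utf-8") as f:
        ids_by_event = json.load(f)

    data = {}
    for event_id, tweet_ids in ids_by_event.items():
        posts = []
        for tweet_id in tweet_ids:
            text = fetch_text(tweet_id)
            if text is not None:  # skip tweets that could not be retrieved
                posts.append({"id": tweet_id, "text": text})
        data[event_id] = posts

    with open(out_path, "wb") as f:
        pickle.dump(data, f)
    return data
```

Passing the download step in as a function keeps the Twitter API client separate from the file handling, so the pickling logic can be tried out with a stub before any API credentials are set up.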

simhadribhargava commented 1 year ago

Hi @znhy1024,

Could I have an example dataset, or could you help me run the dataprocess.py file?

znhy1024 commented 1 year ago

Hi,

Please see if the information regarding the format of the dataset in the closed issue can be of any help to you.