castorini / castor

PyTorch deep learning models for text processing
http://castor.ai/
Apache License 2.0

Dataset path mismatch #143

Open liudonglei opened 6 years ago

liudonglei commented 6 years ago

As of 2018-08-18, the data paths used in Castor/sm_cnn/create_dataset.sh, such as "../../Castor-data/TrecQA", do not match the actual paths in the Castor-data directory.

Could you please check it?

Victor0118 commented 5 years ago

@liudonglei You are right. The current SMCNN code needs refactoring; see https://github.com/castorini/Castor/issues/128. You are welcome to open a PR to contribute!

liudonglei commented 5 years ago

@Victor0118 The dataset path mismatch can be worked around by manually copying the trecqa and wikiqa directories from Castor-data/datasets up to the parent directory.

However, I am now stuck on a different problem, described in https://github.com/castorini/Castor/issues/142. I am not familiar with the torchtext package at all, so I cannot work out where this <pad> token comes from or how to fix it. Could the authors please take a look?

The command I ran:

$ python train.py --mode static --nocuda

The error message:

File "train.py", line 62, in postprocessing=data.Pipeline(lambda arr, _, train: [float(y) for y in arr]))
File "train.py", line 62, in postprocessing=data.Pipeline(lambda arr, _, train: [float(y) for y in arr]))
ValueError: could not convert string to float: '<pad>'
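
For anyone hitting the same ValueError, here is a minimal sketch of one possible workaround. It rests on an assumption that is not confirmed in this thread: that the field is padded before the postprocessing pipeline runs, so the padding string reaches float(). The names PAD_TOKEN and to_floats are hypothetical and do not come from the Castor code, and the sketch deliberately avoids calling torchtext so that it runs on its own.

```python
# Assumption: this is the padding string that ends up inside the batch.
PAD_TOKEN = "<pad>"


def to_floats(arr, pad_value=0.0):
    """Convert a padded list of string features to floats.

    Entries equal to PAD_TOKEN (ignoring surrounding whitespace) are
    replaced by pad_value instead of being passed to float().
    """
    return [pad_value if y.strip() == PAD_TOKEN else float(y) for y in arr]


if __name__ == "__main__":
    padded = ["0.5", "1", PAD_TOKEN]

    # The conversion from train.py line 62 fails on the padding entry.
    try:
        [float(y) for y in padded]
    except ValueError as err:
        print("plain float() conversion fails:", err)

    # The tolerant version maps padding to a numeric placeholder.
    print(to_floats(padded))
```

If the assumption holds, the lambda passed to data.Pipeline could delegate to a function like this. Whether 0.0 is a sensible placeholder depends on how the model consumes these features, so treat the value as a guess.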