yogeshhk closed this issue 3 years ago
Thank you for the feedback. We will look into this issue as soon as possible. @xguo7 will follow up on it.
Could you please check whether the raw data exists on your computer (please refer to https://github.com/graph4ai/graph4nlp/tree/master/examples/pytorch/name_entity_recognition/conll/raw)? Currently, the download function is not implemented, so the raw data should come with the repo when you clone it. (The download function will be implemented in a future version.) We are sorry for the inconvenience.
I see 3 files there: eng.train, eng.testa, eng.testb
The files and their content look fine, with IOB data.
We have run several tests on different computers running Windows 10 and can't reproduce this problem. May I ask from which path you executed this command?
From the root of the graph4nlp folder, which is a clone of my fork. From that path, the text classifier examples work. Here is the call stack:
(graph4nlp) graph4nlp>python examples/pytorch/name_entity_recognition/main.py --graph_type dependency_graph --gpu 0 --init_hidden_size 400 --hidden_size 128 --lr 0.01 --batch_size 100 --gnn_type graphsage --direction_option undirected
Using backend: pytorch
starting build the dataset
Traceback (most recent call last):
  File "examples/pytorch/name_entity_recognition/main.py", line 547, in <module>
    runner = Conll()
  File "examples/pytorch/name_entity_recognition/main.py", line 319, in __init__
    self._build_dataloader()
  File "examples/pytorch/name_entity_recognition/main.py", line 342, in _build_dataloader
    tag_types=self.tag_types)
  File "C:\Users\yogesh.kulkarni\AppData\Local\Continuum\anaconda3\envs\graph4nlp\lib\typing.py", line 1231, in __new__
    return _generic_new(cls.__next_in_mro__, cls, *args, **kwds)
  File "C:\Users\yogesh.kulkarni\AppData\Local\Continuum\anaconda3\envs\graph4nlp\lib\typing.py", line 1186, in _generic_new
    return base_cls.__new__(cls)
TypeError: Can't instantiate abstract class ConllDataset with abstract methods download
This looks weird.
Could you please add the following code
import os
print("The raw data's path is", self.raw_dir)
print(os.path.exists(self.raw_dir))
after https://github.com/graph4ai/graph4nlp/blob/9e1e3b5b83362ab4d8f14b06f7d8dcccc4662cc6/graph4nlp/pytorch/data/dataset.py#L393
and see whether the raw data exists?
It's not hitting that line. Let me debug further and I will keep you posted.
In conll.py:
def download(self):
    print("The raw data's path is", self.raw_dir)
    print(os.path.exists(self.raw_dir))
    # raise NotImplementedError(
    #     'This dataset is now under test and cannot be downloaded. Please prepare the raw data yourself.')
That made it work, but I am still not sure whether this is a good change. I will debug this more.
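For context, the TypeError in the call stack above is the error Python's abc machinery raises when a class that still has an abstract method without an override is instantiated. That would explain why defining download in the subclass makes the error go away. The following is a minimal sketch, assuming the base dataset class declares download as abstract; BaseDataset, MissingDownload, WithDownload and try_instantiate are hypothetical stand-ins, not graph4nlp code.

```python
from abc import ABC, abstractmethod


class BaseDataset(ABC):
    """Hypothetical stand-in for a dataset base class (not graph4nlp's)."""

    def __init__(self):
        # Never reached for a subclass that leaves download() abstract.
        self.initialized = True

    @abstractmethod
    def download(self):
        """Subclasses are expected to override this."""


class MissingDownload(BaseDataset):
    """Does not override download(), so it stays abstract."""


class WithDownload(BaseDataset):
    def download(self):
        # The body can still refuse to download; it only has to exist.
        raise NotImplementedError("Please prepare the raw data yourself.")


def try_instantiate(cls):
    """Return 'ok' or the TypeError message raised at instantiation."""
    try:
        cls()
    except TypeError as exc:
        return f"TypeError: {exc}"
    return "ok"


print(try_instantiate(MissingDownload))
print(try_instantiate(WithDownload))
```

In this sketch MissingDownload fails before __init__ runs, while WithDownload instantiates fine even though its download body only raises, because the method is never called.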
Actually, this function will not be executed. So I guess there must be some fault. Since we can't reproduce this problem, I suggest you debug it more. Thank you!
To make it clearer: when instantiating a Dataset (in this case the ConllDataset), the library checks whether the raw data are present in the environment, in this case the raw directory and its contents, which are specified in the raw_file_names property. If the raw data is not present, the download method is called to download the raw data. In this case the download method is not implemented by ConllDataset, which means the raw data must be present, as it is in the GitHub repo. Otherwise, the NotImplementedError is raised owing to an abstract method call.
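The check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's actual implementation: SketchDataset, SketchConll, build and demo are hypothetical names, and raw_file_names is assumed here to be a plain list of the three file names mentioned earlier in the thread.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class SketchDataset(ABC):
    """Hypothetical base class: look for raw files, else call download()."""

    def __init__(self, root_dir):
        self.raw_dir = os.path.join(root_dir, "raw")
        if not self._raw_files_present():
            self.download()

    @property
    @abstractmethod
    def raw_file_names(self):
        """Names of the files expected inside raw_dir."""

    @abstractmethod
    def download(self):
        """Fetch the raw files when they are missing."""

    def _raw_files_present(self):
        return all(
            os.path.exists(os.path.join(self.raw_dir, name))
            for name in self.raw_file_names
        )


class SketchConll(SketchDataset):
    @property
    def raw_file_names(self):
        # Assumed layout, based on the files listed in this thread.
        return ["eng.train", "eng.testa", "eng.testb"]

    def download(self):
        raise NotImplementedError("Please prepare the raw data yourself.")


def build(root_dir):
    """Report whether instantiation found the data or fell back to download()."""
    try:
        SketchConll(root_dir)
    except NotImplementedError:
        return "download() was called"
    return "raw data found"


def demo():
    """Run build() on an empty directory, then again after adding the files."""
    with tempfile.TemporaryDirectory() as root:
        before = build(root)
        os.makedirs(os.path.join(root, "raw"))
        for name in ["eng.train", "eng.testa", "eng.testb"]:
            open(os.path.join(root, "raw", name), "w").close()
        after = build(root)
    return before, after
```

With this structure, download is only reached when at least one expected file is missing, so a complete raw directory never triggers the NotImplementedError.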
I will close this issue.
@AlanSwift @SaizhuoWang @yogeshhk I am having this exact issue. I raised a new issue before I saw this one. Could you please help?
🐛 Bug
To Reproduce
Steps to reproduce the behavior:
Run
python examples/pytorch/name_entity_recognition/main.py --graph_type dependency_graph --gpu 0 --init_hidden_size 400 --hidden_size 128 --lr 0.01 --batch_size 100 --gnn_type graphsage --direction_option undirected
Getting
TypeError: Can't instantiate abstract class ConllDataset with abstract methods download
Expected behavior
Environment
Installation (pip, source): source
Additional context