NTMC-Community / MatchZoo

Facilitating the design, comparison and sharing of deep text matching models.
Apache License 2.0
3.82k stars, 898 forks

pip does not copy data files into '/usr/local/lib/python3/site-packages/matchzoo/datasets' #781

Open borhan-kazimipour opened 4 years ago

borhan-kazimipour commented 4 years ago

Describe the bug

pip install matchzoo does not copy data files such as train.csv into '/usr/local/lib/python3/site-packages/matchzoo/datasets/toy'.

To Reproduce

Run the following to install MatchZoo:

pip install tensorflow
pip install matchzoo

Then, in Python:

import matchzoo as mz
mz.datasets.toy.load_data(stage='train', task=mz.tasks.Ranking())

This produces:

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-13-d1f2050d872f> in <module>
----> 1 mz.datasets.toy.load_data(stage='train', task=mz.tasks.Ranking())

/usr/local/lib/python3.7/site-packages/matchzoo/datasets/toy/__init__.py in load_data(stage, task, return_classes)
     42 
     43     path = Path(__file__).parent.joinpath(f'{stage}.csv')
---> 44     data_pack = matchzoo.pack(pd.read_csv(path, index_col=0))
     45 
     46     if isinstance(task, matchzoo.tasks.Ranking):

/usr/local/lib/python3.7/site-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision)
    683         )
    684 
--> 685         return _read(filepath_or_buffer, kwds)
    686 
    687     parser_f.__name__ = name

/usr/local/lib/python3.7/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
    455 
    456     # Create the parser.
--> 457     parser = TextFileReader(fp_or_buf, **kwds)
    458 
    459     if chunksize or iterator:

/usr/local/lib/python3.7/site-packages/pandas/io/parsers.py in __init__(self, f, engine, **kwds)
    893             self.options["has_index_names"] = kwds["has_index_names"]
    894 
--> 895         self._make_engine(self.engine)
    896 
    897     def close(self):

/usr/local/lib/python3.7/site-packages/pandas/io/parsers.py in _make_engine(self, engine)
   1133     def _make_engine(self, engine="c"):
   1134         if engine == "c":
-> 1135             self._engine = CParserWrapper(self.f, **self.options)
   1136         else:
   1137             if engine == "python":

/usr/local/lib/python3.7/site-packages/pandas/io/parsers.py in __init__(self, src, **kwds)
   1904         kwds["usecols"] = self.usecols
   1905 
-> 1906         self._reader = parsers.TextReader(src, **kwds)
   1907         self.unnamed_cols = self._reader.unnamed_cols
   1908 

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._setup_parser_source()

FileNotFoundError: [Errno 2] File b'/usr/local/lib/python3.7/site-packages/matchzoo/datasets/toy/train.csv' does not exist: b'/usr/local/lib/python3.7/site-packages/matchzoo/datasets/toy/train.csv'

mz.datasets.list_available() returns ['snli', 'wiki_qa', 'toy', 'quora_qp', 'embeddings']. I can see that the folders and .py files are created in /usr/local/lib/python3.7/site-packages/matchzoo/datasets/toy/, but there is no CSV file.
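For anyone who wants to confirm the same symptom, here is a minimal sketch that lists the non-Python files in a package directory. The helper name list_non_python_files is made up for illustration; the path is the one from the traceback above and will differ per environment.

```python
from pathlib import Path


def list_non_python_files(package_dir):
    """Return sorted names of files in package_dir that are not Python sources."""
    root = Path(package_dir)
    return sorted(
        p.name for p in root.iterdir()
        if p.is_file() and p.suffix not in {".py", ".pyc"}
    )


if __name__ == "__main__":
    # Path copied from the traceback; adjust it for your own installation.
    toy_dir = Path("/usr/local/lib/python3.7/site-packages/matchzoo/datasets/toy")
    if toy_dir.is_dir():
        print(list_non_python_files(toy_dir))
    else:
        print(f"{toy_dir} does not exist")
```

An empty list here would match what is described above: the .py files are installed but the CSV files are not.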

uduse commented 4 years ago

It sounds like you tried the github version and it worked fine.

@faneshion This might be a packaging issue?

borhan-kazimipour commented 4 years ago

> It sounds like you tried the github version and it worked fine.
>
> @faneshion This might be a packaging issue?

I guess it's a packaging issue. If I copy the data files from GitHub to where MatchZoo expects them to be, it works fine.
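A rough sketch of that manual workaround, in case it helps others until a fixed release is out. The function name copy_missing_data is hypothetical, and both directories are supplied by the caller; nothing here is part of MatchZoo itself.

```python
import shutil
from pathlib import Path


def copy_missing_data(src_dir, dst_dir, pattern="*.csv"):
    """Copy files matching pattern from src_dir into dst_dir.

    Files that already exist in dst_dir are left untouched.
    Returns the sorted names of the files that were copied.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    copied = []
    for src_file in sorted(src.glob(pattern)):
        target = dst / src_file.name
        if not target.exists():
            shutil.copy2(src_file, target)
            copied.append(src_file.name)
    return copied
```

Here src_dir would be the toy dataset folder inside a local clone of the repository (assuming the clone mirrors the installed layout), and dst_dir the installed folder shown in the traceback, e.g. /usr/local/lib/python3.7/site-packages/matchzoo/datasets/toy. Writing there may require elevated permissions.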

faneshion commented 4 years ago

Thanks @borhan-kazimipour , I will check it~

matthew-z commented 4 years ago

Same problem here. There are no data files in the toy dataset folder after installing from PyPI.
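One way to tell whether the files are missing from the published archive itself, rather than lost at install time, is to look inside a downloaded wheel. This is only a sketch: data_files_in_archive is a made-up helper, it relies on wheels being ordinary zip files, and it does not apply if pip downloads a .tar.gz source distribution instead.

```python
import zipfile


def data_files_in_archive(archive_path, suffix=".csv"):
    """Return sorted member names in a zip archive (e.g. a .whl) ending with suffix."""
    with zipfile.ZipFile(archive_path) as zf:
        return sorted(name for name in zf.namelist() if name.endswith(suffix))
```

If this returns an empty list for the wheel, the CSV files were never packaged. With setuptools, data files are usually pulled in through package_data or include_package_data together with a MANIFEST.in; which of these MatchZoo's setup.py uses, if any, is not stated in this thread.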