Open jaybundy opened 5 years ago
I'm not 100% sure, but I suspect this issue comes from building from source. If you run pip download --no-binary :all: --no-dependencies dfply, you'll find the same problem: the diamonds.csv file is missing from the dfply/data/ folder. However, if you download the wheel with pip download --no-dependencies dfply and inspect it, you'll find that diamonds.csv is there.
I don't know anything about conda package management, but perhaps they take the result of python setup.py sdist, which would omit the data file. According to this random SO post, a MANIFEST.in file should fix things.
https://stackoverflow.com/questions/7522250/how-to-include-package-data-with-setuptools-distribute
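The wheel-versus-sdist comparison described above can be scripted rather than done by hand. This is a minimal sketch using only the standard library; the helper names archive_members and archive_contains are made up for illustration, and the archive file names in the usage comment are placeholders for whatever pip download actually saved.

```python
import tarfile
import zipfile


def archive_members(path):
    """Return the member names of a wheel (zip) or an sdist (tar archive)."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            return zf.namelist()
    # Default mode "r" detects gzip/bz2/xz compression automatically.
    with tarfile.open(path) as tf:
        return tf.getnames()


def archive_contains(path, suffix):
    """True if any member path in the archive ends with `suffix`."""
    return any(name.endswith(suffix) for name in archive_members(path))


# Usage (placeholder file names, substitute the files you downloaded):
#   archive_contains("dfply-<version>-py3-none-any.whl", "dfply/data/diamonds.csv")
#   archive_contains("dfply-<version>.tar.gz", "dfply/data/diamonds.csv")
```

If the second call returns False while the first returns True, that would be consistent with the data file being left out of the source distribution.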
I ran into the same missing diamonds.csv error when the dfply library was installed using conda (conda install -c tallic dfply). To resolve it, remove the library and its dependencies using conda, then install it with pip in the same conda environment.
The library installed with pip install works.
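A possible variation on the steps above, not something tested in this thread: if the conda-built copy is still present, pip may report the requirement as already satisfied and do nothing. The sketch below builds a pip command that overwrites the installed copy and accepts only a wheel. The function names are made up for illustration; --force-reinstall, --no-deps and --only-binary are regular pip install options. Mixing conda and pip installs of the same package can leave conda's metadata out of sync, so removing the conda package first, as described above, is the cleaner route when it is available.

```python
import subprocess
import sys


def pip_reinstall_command(package):
    """Build a pip command that replaces an installed copy with a wheel."""
    return [
        sys.executable, "-m", "pip", "install",
        "--force-reinstall",        # reinstall even if already satisfied
        "--no-deps",                # leave numpy, pandas, etc. untouched
        "--only-binary", ":all:",   # refuse source distributions
        package,
    ]


def reinstall_from_wheel(package):
    """Run the command in the current environment; raises on failure."""
    subprocess.run(pip_reinstall_command(package), check=True)


# Usage, from the Python interpreter of the affected environment:
#   reinstall_from_wheel("dfply")
```

Using sys.executable with -m pip keeps the install in the same environment as the interpreter running the script.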
I'm still having this same issue: https://github.com/kieferk/dfply/issues/8
I am using conda to install dfply (I need to, because that's the package manager used by the computing cluster I have access to).
conda install -c tallic dfply
That's the command I use to install the package from https://anaconda.org/tallic/dfply.
But when I go to use dfply, it still says the diamonds.csv data is missing.
Traceback (most recent call last):
File "ACH_nested_anova.py", line 1, in <module>
import dfply
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/__init__.py", line 11, in <module>
from .data import diamonds
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/__init__.py", line 5, in <module>
diamonds = pd.read_csv(os.path.join(root, "diamonds.csv"))
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 702, in parser_f
return _read(filepath_or_buffer, kwds)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 429, in _read
parser = TextFileReader(filepath_or_buffer, kwds)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 895, in __init__
self._make_engine(self.engine)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 1122, in _make_engine
self._engine = CParserWrapper(self.f, self.options)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 1853, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 387, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 705, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/diamonds.csv' does not exist: b'/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/diamonds.csv'
2019-03-15 13:25:11 ⌚ gateway-03 in ~/ACH_Development/ACH_tests/ACH_quiz3/python_scripts/Analysis ○ → python ACH_nested_anova.py
Traceback (most recent call last):
File "ACH_nested_anova.py", line 2, in <module>
from dfply import group_by as group_by, summarize as summarize, select as select
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/__init__.py", line 11, in <module>
from .data import diamonds
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/__init__.py", line 5, in <module>
diamonds = pd.read_csv(os.path.join(root, "diamonds.csv"))
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 702, in parser_f
return _read(filepath_or_buffer, kwds)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 429, in _read
parser = TextFileReader(filepath_or_buffer, kwds)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 895, in __init__
self._make_engine(self.engine)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 1122, in _make_engine
self._engine = CParserWrapper(self.f, self.options)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 1853, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 387, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 705, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/diamonds.csv' does not exist: b'/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/diamonds.csv'
2019-03-15 13:25:41 ⌚ gateway-03 in ~/ACH_Development/ACH_tests/ACH_quiz3/python_scripts/Analysis ○ → pip install dfply
Requirement already satisfied: dfply in /mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages (0.3.1)
Requirement already satisfied: numpy in /mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages (from dfply) (1.16.2)
Requirement already satisfied: pandas in /mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages (from dfply) (0.24.2)
Requirement already satisfied: python-dateutil>=2.5.0 in /mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages (from pandas->dfply) (2.8.0)
Requirement already satisfied: pytz>=2011k in /mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages (from pandas->dfply) (2018.9)
Requirement already satisfied: six>=1.5 in /mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages (from python-dateutil>=2.5.0->pandas->dfply) (1.12.0)
2019-03-15 13:26:59 ⌚ gateway-03 in ~/ACH_Development/ACH_tests/ACH_quiz3/python_scripts/Analysis ○ → python ACH_nested_anova.py
Traceback (most recent call last):
File "ACH_nested_anova.py", line 2, in <module>
from dfply import group_by as group_by, summarize as summarize, select as select
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/__init__.py", line 11, in <module>
from .data import diamonds
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/__init__.py", line 5, in <module>
diamonds = pd.read_csv(os.path.join(root, "diamonds.csv"))
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 702, in parser_f
return _read(filepath_or_buffer, kwds)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 429, in _read
parser = TextFileReader(filepath_or_buffer, kwds)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 895, in __init__
self._make_engine(self.engine)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 1122, in _make_engine
self._engine = CParserWrapper(self.f, self.options)
File "/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/pandas/io/parsers.py", line 1853, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 387, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 705, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/diamonds.csv' does not exist: b'/mnt/home/bundyjas/anaconda3/envs/ACH_environment/lib/python3.6/site-packages/dfply/data/diamonds.csv'
I can substitute the import line with any of the following and the result is still the same:
- import dfply
- from dfply import group_by as group_by, summarize as summarize, select as select
- from dfply import *
Please help. I cannot seem to use git or pip to correct the problem. Pip tells me the package is already installed, but I get the same problem. Git is not available to me.
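Since importing dfply itself fails, one way to confirm what is actually on disk is to locate the package without importing it. This is a minimal sketch; the helper name missing_data_files is made up for illustration, and the relative path data/diamonds.csv is taken from the traceback above. importlib.util.find_spec locates a top-level package without executing its __init__.py, so the check works even while the import is broken.

```python
import importlib.util
from pathlib import Path


def missing_data_files(package, relative_paths):
    """Return the entries of `relative_paths` that are absent from the
    installed package directory, without importing the package."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        raise ModuleNotFoundError(f"{package!r} is not an installed package")
    root = Path(next(iter(spec.submodule_search_locations)))
    return [rel for rel in relative_paths if not (root / rel).is_file()]


# Usage, run with the interpreter of the affected environment:
#   missing_data_files("dfply", ["data/diamonds.csv"])
# A non-empty result means the installed copy lacks the data file,
# whatever pip reports about the requirement being satisfied.
```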