Open jmaschino56 opened 1 year ago
This issue is not clear. Could you give a more detailed description of how to reproduce it?
Thanks
It looks like I am getting the same issue when using the statcast API like so:
Example:
df = pyb.statcast(
    start_dt="2016-10-01",
    end_dt="2016-10-31",
)
Stacktrace:
  File "/var/task/pybaseball/statcast.py", line 113, in statcast
    return _handle_request(start_dt_date, end_dt_date, 1, verbose=verbose,
  File "/var/task/pybaseball/statcast.py", line 76, in _handle_request
    dataframe_list.append(future.result())
  File "/var/lang/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/var/lang/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/var/lang/lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/var/task/pybaseball/cache/cache.py", line 58, in _cached
    result = func(*args, **kwargs)
  File "/var/task/pybaseball/statcast.py", line 24, in _small_request
    data = statcast_ds.get_statcast_data_from_csv_url(
  File "/var/task/pybaseball/cache/cache.py", line 58, in _cached
    result = func(*args, **kwargs)
  File "/var/task/pybaseball/datasources/statcast.py", line 23, in get_statcast_data_from_csv_url
    return get_statcast_data_from_csv(
  File "/var/task/pybaseball/datasources/statcast.py", line 35, in get_statcast_data_from_csv
    data = pd.read_csv(io.StringIO(csv_content))
  File "/var/task/pandas/util/_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "/var/task/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "/var/task/pandas/io/parsers/readers.py", line 950, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/var/task/pandas/io/parsers/readers.py", line 611, in _read
    return parser.read(nrows)
  File "/var/task/pandas/io/parsers/readers.py", line 1778, in read
    ) = self._engine.read( # type: ignore[attr-defined]
  File "/var/task/pandas/io/parsers/c_parser_wrapper.py", line 230, in read
    chunks = self._reader.read_low_memory(nrows)
  File "pandas/_libs/parsers.pyx", line 808, in pandas._libs.parsers.TextReader.read_low_memory
    chunk = self._read_rows(self.buffer_lines, 0)
  File "pandas/_libs/parsers.pyx", line 866, in pandas._libs.parsers.TextReader._read_rows
    self._tokenize_rows(irows - buffered_lines)
  File "pandas/_libs/parsers.pyx", line 852, in pandas._libs.parsers.TextReader._tokenize_rows
    raise_parser_error('Error tokenizing data', self.parser)
  File "pandas/_libs/parsers.pyx", line 1973, in pandas._libs.parsers.raise_parser_error
    raise ParserError(message)
pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 13, saw 2
EDIT: I think the issue is that the statcast API returns an error when the CSV is requested.
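To illustrate how a non-CSV response could lead to "Expected 1 fields in line 13, saw 2", here is a minimal sketch using only the standard library. It does not reproduce the pandas tokenizer; it only mimics the idea of taking the expected field count from the first row and flagging the first later row that has more fields. The helper name first_overlong_line and the error_page text are made up for illustration, and the actual server response is not known from this thread.

```python
import csv
import io


def first_overlong_line(text):
    """Return (line_number, expected_fields, seen_fields) for the first row
    with more fields than the first row, or None if no such row exists."""
    reader = csv.reader(io.StringIO(text))
    try:
        first_row = next(reader)
    except StopIteration:
        return None  # empty input, nothing to check
    expected = len(first_row)
    for row in reader:
        if len(row) > expected:
            # reader.line_num is the 1-based number of the last line read
            return reader.line_num, expected, len(row)
    return None


# Hypothetical stand-in for an error page: twelve comma-free lines,
# then a thirteenth line that happens to contain one comma.
# This is NOT the real response from the statcast endpoint.
error_page = "\n".join(
    [f"<p>filler line {i}</p>" for i in range(1, 13)]
    + ["<p>Sorry, something went wrong</p>"]
)

# A tiny well-formed CSV for comparison (column names are illustrative).
good_csv = "pitch_type,game_date\nFF,2016-10-01\n"

print(first_overlong_line(error_page))
print(first_overlong_line(good_csv))
```

If the text handed to the CSV parser is something like an error page, the first line sets the expected width to one field, and the first line containing a comma looks like a row with too many fields.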
I can't reproduce this issue. Could someone who is hitting it run "pip list" in their environment and post the output?
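If the full "pip list" output is too long to post, a shorter alternative is to print only the versions of the packages that appear in the stack trace. This is a small sketch using the standard library importlib.metadata module (Python 3.8+). The helper name installed_versions is made up for illustration, and the choice of packages is an assumption about what is relevant here.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Packages taken from the stack trace above; extend the list as needed.
for name, version in installed_versions(["pybaseball", "pandas"]).items():
    print(name, version or "not installed")
```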
Encountering same issue:
Package Version
absl-py 1.4.0
alabaster 0.7.13
albumentations 1.2.1
altair 4.2.2
anyio 3.6.2
appdirs 1.4.4
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
array-record 0.2.0
arviz 0.15.1
astropy 5.2.2
astunparse 1.6.3
attrs 23.1.0
audioread 3.0.0
autograd 1.5
Babel 2.12.1
backcall 0.2.0
beautifulsoup4 4.11.2
bleach 6.0.0
blis 0.7.9
blosc2 2.0.0
bokeh 2.4.3
branca 0.6.0
build 0.10.0
CacheControl 0.12.11
cached-property 1.5.2
cachetools 5.3.0
catalogue 2.0.8
certifi 2022.12.7
cffi 1.15.1
chardet 4.0.0
charset-normalizer 2.0.12
chex 0.1.7
click 8.1.3
cloudpickle 2.2.1
cmake 3.25.2
cmdstanpy 1.1.0
colorcet 3.0.1
colorlover 0.3.0
community 1.0.0b1
confection 0.0.4
cons 0.4.5
contextlib2 0.6.0.post1
contourpy 1.0.7
convertdate 2.4.0
cryptography 40.0.2
cufflinks 0.17.3
cupy-cuda11x 11.0.0
cvxopt 1.3.0
cvxpy 1.3.1
cycler 0.11.0
cymem 2.0.7
Cython 0.29.34
dask 2022.12.1
datascience 0.17.6
db-dtypes 1.1.1
dbus-python 1.2.16
debugpy 1.6.6
decorator 4.4.2
defusedxml 0.7.1
Deprecated 1.2.14
distributed 2022.12.1
dlib 19.24.1
dm-tree 0.1.8
docutils 0.16
dopamine-rl 4.0.6
duckdb 0.7.1
earthengine-api 0.1.350
easydict 1.10
ecos 2.0.12
editdistance 0.6.2
en-core-web-sm 3.5.0
entrypoints 0.4
ephem 4.1.4
et-xmlfile 1.1.0
etils 1.2.0
etuples 0.3.8
exceptiongroup 1.1.1
fastai 2.7.12
fastcore 1.5.29
fastdownload 0.0.7
fastjsonschema 2.16.3
fastprogress 1.0.3
fastrlock 0.8.1
filelock 3.12.0
firebase-admin 5.3.0
Flask 2.2.4
flatbuffers 23.3.3
flax 0.6.9
folium 0.14.0
fonttools 4.39.3
frozendict 2.3.7
fsspec 2023.4.0
future 0.18.3
gast 0.4.0
GDAL 3.3.2
gdown 4.6.6
gensim 4.3.1
geographiclib 2.0
geopy 2.3.0
gin-config 0.5.0
glob2 0.7
google 2.0.3
google-api-core 2.11.0
google-api-python-client 2.84.0
google-auth 2.17.3
google-auth-httplib2 0.1.0
google-auth-oauthlib 1.0.0
google-cloud-bigquery 3.9.0
google-cloud-bigquery-storage 2.19.1
google-cloud-core 2.3.2
google-cloud-datastore 2.15.1
google-cloud-firestore 2.11.0
google-cloud-language 2.9.1
google-cloud-storage 2.8.0
google-cloud-translate 3.11.1
google-colab 1.0.0
google-crc32c 1.5.0
google-pasta 0.2.0
google-resumable-media 2.5.0
googleapis-common-protos 1.59.0
googledrivedownloader 0.4
graphviz 0.20.1
greenlet 2.0.2
grpcio 1.54.0
grpcio-status 1.48.2
gspread 3.4.2
gspread-dataframe 3.0.8
gym 0.25.2
gym-notices 0.0.8
h5netcdf 1.1.0
h5py 3.8.0
holidays 0.25
holoviews 1.15.4
html5lib 1.1
httpimport 1.3.0
httplib2 0.21.0
huggingface-hub 0.15.1
humanize 4.6.0
hyperopt 0.2.7
idna 3.4
imageio 2.25.1
imageio-ffmpeg 0.4.8
imagesize 1.4.1
imbalanced-learn 0.10.1
imgaug 0.4.0
importlib-resources 5.12.0
imutils 0.5.4
inflect 6.0.4
iniconfig 2.0.0
intel-openmp 2023.1.0
ipykernel 5.5.6
ipython 7.34.0
ipython-genutils 0.2.0
ipython-sql 0.4.1
ipywidgets 7.7.1
itsdangerous 2.1.2
jax 0.4.10
jaxlib 0.4.10+cuda11.cudnn86
jieba 0.42.1
Jinja2 3.1.2
joblib 1.2.0
jsonpickle 3.0.1
jsonschema 4.3.3
jupyter-client 6.1.12
jupyter-console 6.1.0
jupyter_core 5.3.0
jupyter-server 1.24.0
jupyterlab-pygments 0.2.2
jupyterlab-widgets 3.0.7
kaggle 1.5.13
keras 2.12.0
kiwisolver 1.4.4
korean-lunar-calendar 0.3.1
langcodes 3.3.0
lazy_loader 0.2
libclang 16.0.0
librosa 0.10.0.post2
lightgbm 3.3.5
lit 16.0.5
llvmlite 0.39.1
locket 1.0.0
logical-unification 0.4.5
LunarCalendar 0.0.9
lxml 4.9.2
Markdown 3.4.3
markdown-it-py 2.2.0
MarkupSafe 2.1.2
matplotlib 3.7.1
matplotlib-inline 0.1.6
matplotlib-venn 0.11.9
mdurl 0.1.2
miniKanren 1.0.3
missingno 0.5.2
mistune 0.8.4
mizani 0.8.1
mkl 2019.0
ml-dtypes 0.1.0
mlxtend 0.14.0
more-itertools 9.1.0
moviepy 1.0.3
mpmath 1.3.0
msgpack 1.0.5
multipledispatch 0.6.0
multitasking 0.0.11
murmurhash 1.0.9
music21 8.1.0
natsort 8.3.1
nbclient 0.7.4
nbconvert 6.5.4
nbformat 5.8.0
nest-asyncio 1.5.6
networkx 3.1
nibabel 3.0.2
nltk 3.8.1
notebook 6.4.8
numba 0.56.4
numexpr 2.8.4
numpy 1.22.4
oauth2client 4.1.3
oauthlib 3.2.2
opencv-contrib-python 4.7.0.72
opencv-python 4.7.0.72
opencv-python-headless 4.7.0.72
openpyxl 3.0.10
opt-einsum 3.3.0
optax 0.1.5
orbax-checkpoint 0.2.1
osqp 0.6.2.post8
packaging 23.1
palettable 3.3.3
pandas 1.5.3
pandas-datareader 0.10.0
pandas-gbq 0.17.9
pandocfilters 1.5.0
panel 0.14.4
param 1.13.0
parso 0.8.3
partd 1.4.0
pathlib 1.0.1
pathy 0.10.1
patsy 0.5.3
pexpect 4.8.0
pickleshare 0.7.5
Pillow 8.4.0
pip 23.1.2
pip-tools 6.13.0
platformdirs 3.3.0
plotly 5.13.1
plotnine 0.10.1
pluggy 1.0.0
polars 0.17.3
pooch 1.6.0
portpicker 1.3.9
prefetch-generator 1.0.3
preshed 3.0.8
prettytable 0.7.2
proglog 0.1.10
progressbar2 4.2.0
prometheus-client 0.16.0
promise 2.3
prompt-toolkit 3.0.38
prophet 1.1.3
proto-plus 1.22.2
protobuf 3.20.3
psutil 5.9.5
psycopg2 2.9.6
ptyprocess 0.7.0
py-cpuinfo 9.0.0
py4j 0.10.9.7
pyarrow 9.0.0
pyasn1 0.5.0
pyasn1-modules 0.3.0
pybaseball 2.2.5
pycocotools 2.0.6
pycparser 2.21
pyct 0.5.0
pydantic 1.10.7
pydata-google-auth 1.7.0
pydot 1.4.2
pydot-ng 2.0.0
pydotplus 2.0.2
PyDrive 1.3.1
pyerfa 2.0.0.3
pygame 2.3.0
PyGithub 1.58.2
Pygments 2.14.0
PyGObject 3.36.0
PyJWT 2.7.0
pymc 5.1.2
PyMeeus 0.5.12
pymystem3 0.2.0
PyNaCl 1.5.0
PyOpenGL 3.1.6
pyparsing 3.0.9
pyproject_hooks 1.0.0
pyrsistent 0.19.3
PySocks 1.7.1
pytensor 2.10.1
pytest 7.2.2
python-apt 0.0.0
python-dateutil 2.8.2
python-louvain 0.16
python-slugify 8.0.1
python-utils 3.5.2
pytz 2022.7.1
pytz-deprecation-shim 0.1.0.post0
pyviz-comms 2.2.1
PyWavelets 1.4.1
PyYAML 6.0
pyzmq 23.2.1
qdldl 0.1.7
qudida 0.0.4
regex 2022.10.31
requests 2.27.1
requests-oauthlib 1.3.1
requests-unixsocket 0.2.0
requirements-parser 0.5.0
rich 13.3.4
rpy2 3.5.5
rsa 4.9
safetensors 0.3.1
scikit-image 0.19.3
scikit-learn 1.2.2
scipy 1.10.1
scs 3.2.3
seaborn 0.12.2
Send2Trash 1.8.0
setuptools 67.7.2
shapely 2.0.1
six 1.16.0
sklearn-pandas 2.2.0
smart-open 6.3.0
sniffio 1.3.0
snowballstemmer 2.2.0
sortedcontainers 2.4.0
soundfile 0.12.1
soupsieve 2.4.1
soxr 0.3.5
spacy 3.5.2
spacy-legacy 3.0.12
spacy-loggers 1.0.4
Sphinx 3.5.4
sphinxcontrib-applehelp 1.0.4
sphinxcontrib-devhelp 1.0.2
sphinxcontrib-htmlhelp 2.0.1
sphinxcontrib-jsmath 1.0.1
sphinxcontrib-qthelp 1.0.3
sphinxcontrib-serializinghtml 1.1.5
SQLAlchemy 2.0.10
sqlparse 0.4.4
srsly 2.4.6
statsmodels 0.13.5
sympy 1.11.1
tables 3.8.0
tabulate 0.8.10
tblib 1.7.0
tenacity 8.2.2
tensorboard 2.12.2
tensorboard-data-server 0.7.0
tensorboard-plugin-wit 1.8.1
tensorflow 2.12.0
tensorflow-datasets 4.9.2
tensorflow-estimator 2.12.0
tensorflow-gcs-config 2.12.0
tensorflow-hub 0.13.0
tensorflow-io-gcs-filesystem 0.32.0
tensorflow-metadata 1.13.1
tensorflow-probability 0.20.1
tensorstore 0.1.36
termcolor 2.3.0
terminado 0.17.1
text-unidecode 1.3
textblob 0.17.1
tf-slim 1.1.0
thinc 8.1.9
threadpoolctl 3.1.0
tifffile 2023.4.12
tinycss2 1.2.1
tokenizers 0.13.3
toml 0.10.2
tomli 2.0.1
toolz 0.12.0
torch 2.0.1+cu118
torchaudio 2.0.2+cu118
torchdata 0.6.1
torchsummary 1.5.1
torchtext 0.15.2
torchvision 0.15.2+cu118
tornado 6.3.1
tqdm 4.65.0
traitlets 5.7.1
transformers 4.30.2
triton 2.0.0
tweepy 4.13.0
typer 0.7.0
types-setuptools 67.8.0.0
typing_extensions 4.5.0
tzdata 2023.3
tzlocal 4.3
uritemplate 4.1.1
urllib3 1.26.15
vega-datasets 0.9.0
wasabi 1.1.1
wcwidth 0.2.6
webcolors 1.13
webencodings 0.5.1
websocket-client 1.5.1
Werkzeug 2.3.0
wheel 0.40.0
widgetsnbextension 3.6.4
wordcloud 1.8.2.2
wrapt 1.14.1
xarray 2022.12.0
xarray-einstats 0.5.1
xgboost 1.7.5
xlrd 2.0.1
yellowbrick 1.5
yfinance 0.2.18
zict 3.0.0
zipp 3.15.0
pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 13, saw 2
It stems from get_statcast_data_from_csv and happens on 2022-07-17.
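To narrow a failing range down to specific dates like this, one option is to request one day at a time and record which days raise. The sketch below is generic: find_failing_days is a made-up helper that accepts any callable, and fake_fetch is a stand-in so the example runs offline. It does not call the statcast endpoint.

```python
from datetime import date, timedelta


def find_failing_days(fetch, start, end):
    """Call fetch(day_string) for each day in [start, end] and collect failures."""
    failures = []
    day = start
    while day <= end:
        day_str = day.isoformat()  # e.g. "2022-07-17"
        try:
            fetch(day_str)
        except Exception as exc:  # broad on purpose: we only want to log it
            failures.append((day_str, f"{type(exc).__name__}: {exc}"))
        day += timedelta(days=1)
    return failures


def fake_fetch(day_str):
    # Hypothetical stand-in for a real request; fails on one day only.
    if day_str == "2022-07-17":
        raise ValueError("Error tokenizing data")


failures = find_failing_days(fake_fetch, date(2022, 7, 15), date(2022, 7, 19))
print(failures)
```

With the real library, fetch could be something like lambda d: pyb.statcast(start_dt=d, end_dt=d), assuming pyb is the pybaseball module as in the example earlier in this thread.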