The issue was fixed in plotnine v0.10.0; however, upgrading to that version requires upgrading several other packages because of pandas and google dependencies.
I get the following error when running `make MONTH= all`:
ModuleNotFoundError: No module named 'siuba.sql.dialects.bigquery'
```
Exception encountered at "In [7]":
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
/tmp/ipykernel_20/2135866344.py in
4 ## start/end dates now dt.date, need to format downstream?...
5 ## collect here for 1 less query, small table after all
----> 6 tbl_dim_feeds = (tbl.views.gtfs_schedule_dim_feeds()
7 >> filter_end
8 >> filter_itp
~/venv/lib/python3.8/site-packages/calitp/tables.py in __call__(self)
79
80 def __call__(self):
---> 81 return self._create_table()
82
83 def _create_table(self):
~/venv/lib/python3.8/site-packages/calitp/tables.py in _create_table(self)
82
83 def _create_table(self):
---> 84 return LazyTbl(self.engine, self.table_name)
85
86 def _row_html(self, col):
~/venv/lib/python3.8/site-packages/siuba/sql/verbs.py in __init__(self, source, tbl, columns, ops, group_by, order_by, funcs, rm_attr, call_sub_attr, dispatch_cls, result_cls)
213
214 dialect = self.source.dialect.name
--> 215 self.funcs = get_dialect_funcs(dialect) if funcs is None else funcs
216 self.dispatch_cls = get_sql_classes(dialect) if dispatch_cls is None else dispatch_cls
217 self.result_cls = result_cls
~/venv/lib/python3.8/site-packages/siuba/sql/utils.py in get_dialect_funcs(name)
3 def get_dialect_funcs(name):
4 #dialect = engine.dialect.name
----> 5 mod = importlib.import_module('siuba.sql.dialects.{}'.format(name))
6 return mod.funcs
7
/usr/local/lib/python3.8/importlib/__init__.py in import_module(name, package)
125 break
126 level += 1
--> 127 return _bootstrap._gcd_import(name[level:], package, level)
128
129
/usr/local/lib/python3.8/importlib/_bootstrap.py in _gcd_import(name, package, level)
/usr/local/lib/python3.8/importlib/_bootstrap.py in _find_and_load(name, import_)
/usr/local/lib/python3.8/importlib/_bootstrap.py in _find_and_load_unlocked(name, import_)
ModuleNotFoundError: No module named 'siuba.sql.dialects.bigquery'
```
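The last frames show siuba building the module name `siuba.sql.dialects.<dialect>` from the SQLAlchemy dialect name and importing it. A minimal sketch for checking whether such a module can be found in a given environment, without running the whole report, might look like the following. The helper name `module_available` is hypothetical; it relies only on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def module_available(dotted_name):
    """Return True if `dotted_name` can be located by the import system."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # find_spec imports parent packages first; this is raised when a
        # parent (e.g. `siuba.sql.dialects`) is missing or fails to import.
        return False


if __name__ == "__main__":
    # Module name taken from the traceback above.
    print(module_available("siuba.sql.dialects.bigquery"))
```

Running this in both the working and the upgraded environment would show whether the installed siuba build ships the bigquery dialect module at all.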
I'm not sure why these package updates would cause this error. The `pip list` diff between downgrading matplotlib and upgrading plotnine with minimal package updates is:
Additionally, it is worth noting that if I try to upgrade to more recent versions of siuba and calitp, I get the following when running `make MONTH=12 all`:
unnest errors
```sql
DatabaseError: (google.cloud.bigquery.dbapi.exceptions.DatabaseError) 400 No matching signature for operator IN UNNEST for argument types: DATE, ARRAY at [6:38]
Location: us-west2
Job ID: 5d14922f-68aa-4838-8091-9ad5fa2ad28f
[SQL: SELECT `anon_1`.`name`, `anon_1`.`calitp_extracted_at`
FROM (SELECT `anon_2`.`calitp_itp_id` AS `calitp_itp_id`, `anon_2`.`calitp_url_number` AS `calitp_url_number`, `anon_2`.`calitp_extracted_at` AS `calitp_extracted_at`, `anon_2`.`full_path` AS `full_path`, `anon_2`.`name` AS `name`, `anon_2`.`size` AS `size`, `anon_2`.`md5_hash` AS `md5_hash`, `anon_2`.`is_loadable_file` AS `is_loadable_file`, `anon_2`.`prev_md5_hash` AS `prev_md5_hash`, `anon_2`.`is_changed` AS `is_changed`, `anon_2`.`is_first_extraction` AS `is_first_extraction`, `anon_2`.`is_validation` AS `is_validation`, `anon_2`.`is_agency_changed` AS `is_agency_changed`
FROM (SELECT `gtfs_schedule_history.calitp_files_updates`.`calitp_itp_id` AS `calitp_itp_id`, `gtfs_schedule_history.calitp_files_updates`.`calitp_url_number` AS `calitp_url_number`, `gtfs_schedule_history.calitp_files_updates`.`calitp_extracted_at` AS `calitp_extracted_at`, `gtfs_schedule_history.calitp_files_updates`.`full_path` AS `full_path`, `gtfs_schedule_history.calitp_files_updates`.`name` AS `name`, `gtfs_schedule_history.calitp_files_updates`.`size` AS `size`, `gtfs_schedule_history.calitp_files_updates`.`md5_hash` AS `md5_hash`, `gtfs_schedule_history.calitp_files_updates`.`is_loadable_file` AS `is_loadable_file`, `gtfs_schedule_history.calitp_files_updates`.`prev_md5_hash` AS `prev_md5_hash`, `gtfs_schedule_history.calitp_files_updates`.`is_changed` AS `is_changed`, `gtfs_schedule_history.calitp_files_updates`.`is_first_extraction` AS `is_first_extraction`, `gtfs_schedule_history.calitp_files_updates`.`is_validation` AS `is_validation`, `gtfs_schedule_history.calitp_files_updates`.`is_agency_changed` AS `is_agency_changed`
FROM `gtfs_schedule_history.calitp_files_updates`) AS `anon_2`
WHERE `anon_2`.`calitp_itp_id` = %(calitp_itp_id_1:INT64)s AND `anon_2`.`calitp_url_number` = %(calitp_url_number_1:INT64)s) AS `anon_1`
WHERE `anon_1`.`calitp_extracted_at` IN UNNEST(%(calitp_extracted_at_1:STRING)s)]
[parameters: {'calitp_itp_id_1': 10, 'calitp_url_number_1': 0, 'calitp_extracted_at_1': ['2022-12-04', '2022-12-18']}]
(Background on this error at: https://sqlalche.me/e/14/4xp6)
make: *** [Makefile:22: outputs/2022/12/10/index.ipynb] Error 1
```
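The error reports `IN UNNEST` being applied to a `DATE` column and an array parameter typed as `STRING` (see `calitp_extracted_at_1:STRING` in the statement). One possible workaround, sketched here as an untested assumption rather than a verified fix, is to pass `datetime.date` objects instead of ISO strings so that the parameter type can match the column. The helper name `to_dates` is hypothetical.

```python
from datetime import date


def to_dates(iso_strings):
    """Convert 'YYYY-MM-DD' strings to datetime.date objects."""
    return [date.fromisoformat(s) for s in iso_strings]


# Values taken from the parameters in the error above.
extract_dates = to_dates(["2022-12-04", "2022-12-18"])
```

Whether the newer siuba and calitp versions then bind the parameter as an array of `DATE` would still need to be checked against BigQuery.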
This PR works; however, it would be good to identify an upgrade path for the report generation packages. I created an issue to test PRs (#205) for future changes.
Type of change
[x] Bug fix (non-breaking change which fixes an issue)
[ ] New feature
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
Description
When running the report locally, I get the following error:
The resolution is described in this Stack Overflow Q&A.
The issue was fixed in plotnine v0.10.0; however, upgrading to that version requires upgrading several other packages because of pandas and google dependencies:
Update plotnine diff
```diff
diff --git a/requirements.txt b/requirements.txt
index e509b845a..8c77b4ea6 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,11 +1,15 @@
 aiohttp==3.7.4.post0
 ansiwrap==0.8.4
+anyio==3.6.2
 appdirs==1.4.4
 appnope==0.1.2
 argon2-cffi==21.1.0
 async-timeout==3.0.1
 attrs==21.2.0
+Babel==2.11.0
 backcall==0.2.0
+backports.zoneinfo==0.2.1
+beautifulsoup4==4.11.1
 bleach==4.1.0
 cachetools==4.2.2
 calitp==0.0.15
@@ -14,26 +18,31 @@ cffi==1.14.5
 cfgv==3.3.0
 chardet==4.0.0
 click==8.0.1
+contourpy==1.0.6
+cycler==0.11.0
 debugpy==1.4.3
 decorator==5.0.9
 defusedxml==0.7.1
 distlib==0.3.2
 entrypoints==0.3
+fastjsonschema==2.16.2
 filelock==3.0.12
+fonttools==4.38.0
 fsspec==2021.6.0
 future==0.18.2
 gcsfs==0.8.0
-google-api-core==2.7.1
-google-auth==1.31.0
-google-auth-oauthlib==0.4.4
+google-api-core==2.8.0
+google-auth==2.15.0
+google-auth-oauthlib==0.8.0
 google-cloud-bigquery==2.34.3
-google-cloud-bigquery-storage==2.7.0
-google-cloud-core==2.2.3
+google-cloud-bigquery-storage==2.13.1
+google-cloud-core==2.3.2
 google-crc32c==1.1.2
 google-resumable-media==1.3.0
 googleapis-common-protos==1.53.0
 grpcio==1.44.0
 grpcio-status==1.44.0
+gtfs-realtime-bindings==0.0.7
 identify==2.2.10
 idna==2.10
 iniconfig==1.1.1
@@ -44,39 +53,51 @@ ipython-genutils==0.2.0
 ipywidgets==7.6.4
 jedi==0.18.0
 Jinja2==3.0.1
+json5==0.9.11
 jsonschema==3.2.0
 jupyter==1.0.0
-jupyterlab==3.4.4
 jupyter-client==7.0.2
 jupyter-console==6.4.0
 jupyter-core==4.7.1
+jupyter-server==1.23.4
+jupyterlab==3.4.4
 jupyterlab-pygments==0.1.2
+jupyterlab-server==2.10.3
 jupyterlab-widgets==1.0.1
+kiwisolver==1.4.4
 libcst==0.3.20
+lxml==4.9.2
 MarkupSafe==2.0.1
+matplotlib==3.6.2
 matplotlib-inline==0.1.2
 mistune==0.8.4
+mizani==0.8.1
 multidict==5.1.0
 mypy-extensions==0.4.3
+nbclassic==0.4.8
 nbclient==0.5.4
 nbconvert==6.5.1
 nbformat==5.4.0
 nest-asyncio==1.5.1
 nodeenv==1.6.0
 notebook==6.4.12
+notebook_shim==0.2.2
 numpy==1.22.0
 oauthlib==3.2.1
-packaging==20.9
-pandas==1.1.4
+packaging==22.0
+palettable==3.3.0
+pandas==1.5.2
 pandas-gbq==0.14.1
 pandocfilters==1.4.3
 papermill==2.3.4
 parso==0.8.2
 pathspec==0.9.0
+patsy==0.5.3
 pexpect==4.8.0
 pickleshare==0.7.5
+Pillow==9.4.0
 platformdirs==2.3.0
-plotnine==0.8.0
+plotnine==0.10.1
 pluggy==0.13.1
 postmarker==0.18.2
 prometheus-client==0.11.0
@@ -104,14 +125,20 @@ regex==2021.8.28
 requests==2.25.1
 requests-oauthlib==1.3.0
 rsa==4.7.2
+scipy==1.10.0
 Send2Trash==1.8.0
-git+https://github.com/machow/siuba.git@stable
+siuba==0.0.25
 six==1.16.0
+sniffio==1.3.0
+soupsieve==2.3.2.post1
 SQLAlchemy==1.3.24
+sqlalchemy-bigquery==1.5.0
+statsmodels==0.13.5
 tenacity==8.0.1
 terminado==0.12.1
 testpath==0.5.0
 textwrap3==0.9.2
+tinycss2==1.2.1
 toml==0.10.2
 tomli==1.2.1
 tornado==6.1
@@ -124,5 +151,6 @@ urllib3==1.26.5
 virtualenv==20.4.7
 wcwidth==0.2.5
 webencodings==0.5.1
+websocket-client==1.4.2
 widgetsnbextension==3.5.1
 yarl==1.6.3
```
I get the following error when running `make MONTH= all`:
ModuleNotFoundError: No module named 'siuba.sql.dialects.bigquery'
```
Exception encountered at "In [7]":
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
/tmp/ipykernel_20/2135866344.py in
```
I'm not sure why these package updates would cause this error. The `pip list` diff between downgrading matplotlib and upgrading plotnine with minimal package updates is:
working < not working diff
```diff
diff working-matlab-3.5.3.txt broken-plotnine-10-minimal-ups.txt
12a13
> backports.zoneinfo 0.2.1
21a23
> contourpy 1.0.6
26d27
< descartes 1.1.0
35,37c36,38
< google-api-core 2.7.1
< google-auth 1.31.0
< google-auth-oauthlib 0.4.4
---
> google-api-core 2.8.0
> google-auth 2.15.0
> google-auth-oauthlib 0.8.0
39,40c40,41
< google-cloud-bigquery-storage 2.7.0
< google-cloud-core 2.2.3
---
> google-cloud-bigquery-storage 2.13.1
> google-cloud-core 2.3.2
72c73
< matplotlib 3.5.3
---
> matplotlib 3.6.2
75c76
< mizani 0.7.3
---
> mizani 0.8.1
88c89
< packaging 20.9
---
> packaging 22.0
90c91
< pandas 1.1.4
---
> pandas 1.5.2
102c103
< plotnine 0.8.0
---
> plotnine 0.10.1
139c140
< statsmodels 0.13.1
---
> statsmodels 0.13.5
```
Additionally, it is worth noting that if I try to upgrade to more recent versions of siuba and calitp, I get the following when running `make MONTH=12 all`:
unnest errors
```sql
DatabaseError: (google.cloud.bigquery.dbapi.exceptions.DatabaseError) 400 No matching signature for operator IN UNNEST for argument types: DATE, ARRAY
```
This PR works; however, it would be good to identify an upgrade path for the report generation packages. I created an issue to test PRs (#205) for future changes.
How has this been tested?
Tested locally through the docker container.
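To repeat the working versus not-working comparison from the `pip list` diff in the description, a small sketch that reports only the packages whose versions differ is shown below. It assumes each input is the plain-text output of `pip list` (a header, a dashed rule, then `name version` rows); the function names and the sample data are hypothetical, with versions borrowed from the diff.

```python
def parse_pip_list(text):
    """Map package name -> version from `pip list` text output."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the "Package Version" header, and the dashed rule.
        if len(parts) < 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        packages[parts[0]] = parts[1]
    return packages


def changed_packages(working, broken):
    """Return {name: (working_version, broken_version)} for every difference."""
    names = sorted(set(working) | set(broken))
    return {
        name: (working.get(name), broken.get(name))
        for name in names
        if working.get(name) != broken.get(name)
    }


# Tiny illustrative samples, not the full environments.
working = parse_pip_list(
    "Package Version\n------- -------\n"
    "descartes 1.1.0\nmatplotlib 3.5.3\nnumpy 1.22.0\nplotnine 0.8.0\n"
)
broken = parse_pip_list(
    "Package Version\n------- -------\n"
    "matplotlib 3.6.2\nnumpy 1.22.0\nplotnine 0.10.1\n"
)
changes = changed_packages(working, broken)
```

A version of `None` on either side means the package is absent from that environment, as with `descartes` in the sample.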