lwrubel closed 1 month ago
Failed again with:
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/models/taskinstance.py", line 465, in _execute_task
result = _execute_callable(context=context, **execute_callable_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/models/taskinstance.py", line 432, in _execute_callable
return execute_callable(context=context, **execute_callable_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/models/baseoperator.py", line 401, in wrapper
return func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/decorators/base.py", line 265, in execute
return_value = super().execute(context)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/models/baseoperator.py", line 401, in wrapper
return func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/operators/python.py", line 235, in execute
return_value = self.execute_callable()
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/operators/python.py", line 252, in execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/airflow/rialto_airflow/dags/harvest.py", line 95, in openalex_harvest_pubs
openalex.publications_csv(dois, csv_file)
File "/opt/airflow/rialto_airflow/harvest/openalex.py", line 76, in publications_csv
for pub in publications_from_dois(dois):
File "/opt/airflow/rialto_airflow/harvest/openalex.py", line 89, in publications_from_dois
for page in Works().filter(doi=doi_list).paginate(per_page=200):
File "/home/airflow/.local/lib/python3.12/site-packages/pyalex/api.py", line 147, in __next__
results, meta = self.endpoint_class.get(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/pyalex/api.py", line 293, in get
return self._get_from_url(self.url, return_meta=return_meta)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/pyalex/api.py", line 265, in _get_from_url
raise QueryError(res.json()["message"])
pyalex.api.QueryError: 4_2|10.58530/2022/1705|10.7910/dvn/b4uo2l/grm2dr|10.1016/j.carbon.2015.01.017|10.1016/j.jnoncrysol.2014.04.003|10.1053/j.jrn.2010.09.007|10.1186/gb-2010-11-1-r10|10.2312/vcbm.20161282|10.1002/jmri.23614|10.1007/s00401-018-1859-2|10.1145/3322126|10.7910/dvn/b4uo2l/rkccfs|10.1152/ajpheart.00790.2002|10.1007/978-981-15-3449-2_6|10.1056/nejmoa032520|10.1128/aem.03950-14|10.1109/ectc.2018.00061|10.1016/j.healun.2015.10.039|10.3847/1538-4357/ac0053|10.1038/leu.2011.213|10.1353/rus.2016.0005|10.1002/adfm.201201848|10.1007/s10955-019-02249-9|10.1016/j.athoracsur.2021.07.058|10.21203/rs.3.rs-2883579/v1|10.1115/imece1998-0246|10.7910/dvn/ttlqrn/o1w8s7|10.1038/s41598-022-21510-y|10.7910/dvn/8dushz/budenx|10.7189/jogh.14.04011|10.1126/sciimmunol.aat8116|10.3802/jgo.2021.32.e14|10.1111/j.1552-6909.2012.01379.x|10.1182/blood-2018-99-113328|10.1145/3449101|10.1021/acsnano.5b02432|10.1016/j.jss.2024.01.009|10.1046/j.1440-0952.2002.00932.x|10.1101/767988|10.1016/j.gca.2022.12.008|10.1089/wound.2016.0709 is not a valid parameter. Valid parameters are: apc_sum, cited_by_count_sum, cursor, filter, format, group_by, group-by, group_bys, group-bys, mailto, page, per_page, per-page, q, sample, seed, search, select, sort.
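The rejected "parameter" starts with `4_2|`, which suggests the query string was split partway through a DOI. One possible reading, which is an assumption and not confirmed anywhere in this thread, is that a DOI in the batch contained a character such as `&` that has meaning in a URL or in the filter syntax. A minimal sketch of screening DOIs before joining them with `|` follows; `split_dois`, `doi_batches`, the character set, and the batch size of 40 are all hypothetical and are not taken from `rialto_airflow`.

```python
import re

# Characters assumed (not confirmed by the source) to be unsafe inside a
# filter value: "&" and "#" can split the query string, "," separates
# filters, "|" is the OR separator used to join DOIs, and whitespace is
# never expected in a DOI.
UNSAFE_CHARS = re.compile(r"[&#,|\s]")


def split_dois(dois):
    """Partition DOIs into (safe, suspect) lists. Hypothetical helper."""
    safe, suspect = [], []
    for doi in dois:
        (suspect if UNSAFE_CHARS.search(doi) else safe).append(doi)
    return safe, suspect


def doi_batches(dois, size=40):
    """Yield pipe-joined strings of at most `size` DOIs each."""
    for i in range(0, len(dois), size):
        yield "|".join(dois[i:i + size])


safe, suspect = split_dois([
    "10.1145/3322126",
    "10.1101/767988",
    "10.0000/example&4_2",  # made-up DOI, for illustration only
])
batches = list(doi_batches(safe))
```

Each string in `batches` could then be passed as the `doi` value in the `Works().filter(doi=...)` call shown in the traceback, and the `suspect` list logged so the offending DOI can be identified.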
Closing since we haven't seen this in recent runs.
When running with a dev_limit of 1000, the openalex_harvest_pubs task sometimes fails with: