sul-dlss / libsys-airflow

Airflow DAGS for migrating and managing ILS data into FOLIO along with other LibSys workflows
Apache License 2.0

Returns a list of lists for invoice_lines_paid_on_fund task. #1383

Closed shelleydoljack closed 3 weeks ago

shelleydoljack commented 3 weeks ago

Fixes #1335

When the digital_bookplate_instances DAG is triggered with a conf that contains a new fund, the funds branch is used, so the invoice_lines_paid_on_fund task queries FOLIO for paid invoice lines whose fundDistributions list contains the fund_uuid. In some cases there are more invoice lines than can be mapped to dynamic tasks (currently in prod there are 2,026 paid invoice lines with the LINDER fund). I've changed the invoice_lines_paid_on_fund task to return a list of lists, e.g.

[
  [{folio invoice line dict}, {folio invoice line dict}, {folio invoice line dict}, ...],
  [{folio invoice line dict}, {folio invoice line dict}, {folio invoice line dict}, ...]
]

If the number of paid invoice lines for a particular fund is greater than 1,000, each inner list will contain 100 folio invoice line dicts. Otherwise each inner list will contain 5. I am a little worried about the instances_from_po_lines task running for each batch of 100 paid invoice lines: we will be hitting Okapi hard at that point and may see a lot of dropped connections because of the connection pooling bug in the httpx implementation of FolioClient.
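For readers following along, here is a minimal sketch of the batching rule described above. It is not the code from this PR: the function name `batch_invoice_lines` and its parameters are hypothetical, and the placeholder dicts stand in for real folio invoice line dicts. Only the 1,000 threshold and the batch sizes of 100 and 5 come from the description.

```python
from typing import Any


def batch_invoice_lines(
    invoice_lines: list[dict[str, Any]],
    threshold: int = 1000,   # from the PR description
    large_batch: int = 100,  # batch size when there are more than `threshold` lines
    small_batch: int = 5,    # batch size for all other cases
) -> list[list[dict[str, Any]]]:
    """Split a flat list of invoice line dicts into a list of lists.

    Hypothetical helper: illustrates the rule only, not the task's actual code.
    """
    size = large_batch if len(invoice_lines) > threshold else small_batch
    # Slice the flat list into consecutive chunks; the last chunk may be shorter.
    return [
        invoice_lines[start:start + size]
        for start in range(0, len(invoice_lines), size)
    ]


# Placeholder data sized like the LINDER fund case (2,026 paid invoice lines).
lines = [{"id": str(n)} for n in range(2026)]
batches = batch_invoice_lines(lines)
print(len(batches), len(batches[0]), len(batches[-1]))  # 21 100 26
```

With 2,026 lines this yields 21 inner lists instead of 2,026 individual items, so the number of mapped task instances is the number of batches, at the cost of each mapped task making more requests.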

shelleydoljack commented 3 weeks ago

Also, thinking about this more: we could deploy to prod without this PR and let the LINDER fund fail, then deploy to prod with this PR merged and trigger digital_bookplate_instances with just the LINDER fund. That way we would know more about what to expect, since the DAGs would first run in the same form that we had them running on airflow-stage.

shelleydoljack commented 3 weeks ago

We'll merge this after we deploy to prod and run the digital bookplates dags to completion.