fedspendingtransparency / usaspending-api

Server application to serve U.S. federal spending data via a RESTful API
https://www.usaspending.gov
Creative Commons Zero v1.0 Universal

Failure when downloading Account Data #4124

Open dev-acc-10000 opened 3 months ago

dev-acc-10000 commented 3 months ago

We are trying to download treasury account data for all agencies across all budget functions, for fiscal years 2022-2024 and for all file types. Because file generation takes time, we split the work into multiple requests, one per budget function. Each request covers all agencies up to period 12 for budget function ###. Some of these requests succeed, while others fail with the following error:

An exception was raised while attempting to process the DownloadJob:
Traceback (most recent call last):
  File "/data-act/backend/usaspending_api/download/filestreaming/download_generation.py", line 97, in generate_download
    parse_source(
  File "/data-act/backend/usaspending_api/download/filestreaming/download_generation.py", line 430, in parse_source
    raise e
  File "/data-act/backend/usaspending_api/download/filestreaming/download_generation.py", line 405, in parse_source
    wait_for_process(psql_process, start_time, download_job)
  File "/data-act/backend/usaspending_api/download/filestreaming/download_generation.py", line 539, in wait_for_process
    raise e
Exception: Command failed. Please see the logs for details.

We are also noticing that files with a small number of rows take quite a long time to generate (our assumption is that jobs are added to some kind of queue). For example, https://api.usaspending.gov/api/v2/download/status?file_name=FY2024P01-P12_All_TAS_AccountData_2024-07-02_H19M40S11034789.zip eventually fails, but its total number of rows is 0. Sample requests are attached below; any guidance would be appreciated.

Request that succeeded:

curl -X POST \
  https://api.usaspending.gov/api/v2/download/accounts \
  -H 'Content-Type: application/json' \
  -d '{
    "account_level": "treasury_account",
    "file_format": "csv",
    "filters": {
      "agency": "all",
      "fy": "2022",
      "budget_function": "900",
      "period": "12",
      "submission_types": ["account_balances", "object_class_program_activity", "award_financial"]
    }
  }'

Request that failed (the request itself succeeds, but the file status eventually shows the failure above):

curl -X POST \
  https://api.usaspending.gov/api/v2/download/accounts \
  -H 'Content-Type: application/json' \
  -d '{
    "account_level": "treasury_account",
    "file_format": "csv",
    "filters": {
      "agency": "all",
      "fy": "2024",
      "budget_function": "250",
      "period": "12",
      "submission_types": ["account_balances", "object_class_program_activity", "award_financial"]
    }
  }'
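For anyone trying to reproduce this, the submit-then-poll workflow used in this thread can be sketched in Python roughly as follows. This is a minimal sketch, not the project's own client. The two URLs and the request body come from the curl examples and status link in this issue. It assumes the status response is JSON with a `status` field that ends up as `"finished"` or `"failed"`, and that the POST response carries a `file_name` field. The helper names (`build_payload`, `http_json`, `wait_for_download`) and the polling interval are hypothetical.

```python
import json
import time
import urllib.parse
import urllib.request

# Base URL taken from the curl examples in this issue.
BASE_URL = "https://api.usaspending.gov/api/v2/download"


def build_payload(fy, budget_function, period="12"):
    """Build the same request body as the curl examples, for one budget function."""
    return {
        "account_level": "treasury_account",
        "file_format": "csv",
        "filters": {
            "agency": "all",
            "fy": str(fy),
            "budget_function": str(budget_function),
            "period": str(period),
            "submission_types": [
                "account_balances",
                "object_class_program_activity",
                "award_financial",
            ],
        },
    }


def http_json(url, body=None):
    """POST `body` as JSON when given, otherwise GET; return the decoded JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def wait_for_download(file_name, fetch=http_json, interval=30, max_polls=120,
                      sleep=time.sleep):
    """Poll the status endpoint until the job reaches a terminal state.

    Assumption: the reply has a "status" field whose terminal values are
    "finished" and "failed". `fetch` and `sleep` are injectable for testing.
    """
    query = urllib.parse.urlencode({"file_name": file_name})
    url = f"{BASE_URL}/status?{query}"
    for _ in range(max_polls):
        status = fetch(url)
        if status.get("status") in ("finished", "failed"):
            return status
        sleep(interval)
    raise TimeoutError(f"{file_name} did not finish after {max_polls} polls")
```

A caller would submit a job with `http_json(f"{BASE_URL}/accounts", build_payload(2024, "250"))`, read the file name from the reply (assumed here to be under `file_name`), and pass it to `wait_for_download`. Logging the full status reply for each failed job makes it easier to compare budget functions that fail against those that succeed.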

jryoo3 commented 2 months ago

Hi USAspending API team @dporth-frb @aguest-kc @ayubshahab @collinwr @boozallendanny, I wanted to check in on the status of this issue. Is it possible to prioritize resolving this issue? This issue is impacting our current development workflow, and a timely fix would be greatly appreciated. Thank you!