GSA / data.gov

Main repository for the data.gov service
https://data.gov

Dashboard data.json validation handles large file upload #3840

Closed FuhuXia closed 1 month ago

FuhuXia commented 2 years ago

When a large data.json file (150 MB) is uploaded to https://dashboard.data.gov/validate for validation, the server returns a 500 error.

It should be able to handle data.json files up to 200 MB in size.

How to reproduce

Go to https://dashboard.data.gov/validate, go to "Validate data.json file upload", choose a sample data.json file about 150 MB in size, and click "Validate File".
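If no large data.json is at hand, a synthetic one can be generated for the reproduction. The following is only a minimal sketch: the function name `write_sample_catalog` and the record fields are hypothetical placeholders, and the output is well-formed JSON with a top-level `dataset` array but is not a schema-complete catalog, so the validator would be expected to report schema errors rather than crash.

```python
import json


def write_sample_catalog(path, target_bytes):
    """Write a JSON file of at least target_bytes; return the number of records."""
    header = b'{"dataset": ['
    footer = b"]}"
    written = 0
    count = 0
    with open(path, "wb") as fh:
        written += fh.write(header)
        # Append padding records until the file reaches the requested size.
        while written + len(footer) < target_bytes:
            entry = {
                # Placeholder fields, assumed for illustration only.
                "identifier": f"sample-{count}",
                "title": f"Sample dataset {count}",
                "description": "Placeholder record used only to pad the file. " * 4,
            }
            chunk = json.dumps(entry).encode("utf-8")
            if count:
                chunk = b"," + chunk  # comma-separate every record after the first
            written += fh.write(chunk)
            count += 1
        fh.write(footer)
    return count
```

For example, `write_sample_catalog("sample-data.json", 150 * 1024 * 1024)` would produce a file of roughly 150 MB to upload in the step above.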

Expected behavior

Successful validation result

Actual behavior

500 Error.

2022-06-01T09:46:16.83-0400 [RTR/3] OUT dashboard-dev-datagov.app.cloud.gov - [2022-06-01T13:46:10.802594856Z] "POST /validate HTTP/1.1" 503 276040359 299 "https://dashboard-dev-datagov.app.cloud.gov/validate" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36" "127.0.0.1:16350" "10.10.2.28:61144" x_forwarded_for:"2600:4040:2855:9c00:177:e2f0:1c0f:906f, 127.0.0.1" x_forwarded_proto:"https" vcap_request_id:"598f6394-9ae8-4b17-6850-2389a0024936" response_time:6.033775 gorouter_time:0.000204 app_id:"32340948-9329-45f2-9766-a38b36530af5" app_index:"0" instance_id:"197d16a3-e7ab-47b5-64cb-9f9c" x_cf_routererror:"-" x_b3_traceid:"5992859efd7c7770288ef14fb0f6de7d" x_b3_spanid:"288ef14fb0f6de7d" x_b3_parentspanid:"-" b3:"5992859efd7c7770288ef14fb0f6de7d-288ef14fb0f6de7d"
2022-06-01T09:46:16.84-0400 [APP/PROC/WEB/0] OUT 13:46:16 httpd   | [Wed Jun 01 13:46:16.822928 2022] [proxy_fcgi:error] [pid 298:tid 140155741161216] [client 127.0.0.1:0] AH01067: Failed to read FastCGI header, referer: https://dashboard-dev-datagov.app.cloud.gov/validate
2022-06-01T09:46:16.84-0400 [APP/PROC/WEB/0] OUT 13:46:16 php-fpm | [01-Jun-2022 13:46:16] WARNING: [pool www] child 326 exited on signal 9 (SIGKILL) after 75967.989628 seconds from start
2022-06-01T09:46:16.84-0400 [APP/PROC/WEB/0] OUT 13:46:16 httpd   | [Wed Jun 01 13:46:16.829251 2022] [proxy_fcgi:error] [pid 298:tid 140155741161216] (104)Connection reset by peer: [client 127.0.0.1:0] AH01075: Error dispatching request to : , referer: https://dashboard-dev-datagov.app.cloud.gov/validate
2022-06-01T09:46:16.84-0400 [APP/PROC/WEB/0] OUT 13:46:16 php-fpm | [01-Jun-2022 13:46:16] NOTICE: [pool www] child 347 started
2022-06-01T09:46:16.84-0400 [APP/PROC/WEB/0] OUT 13:46:16 httpd   | 127.0.0.1 - - [01/Jun/2022:13:46:10 +0000] "POST /validate HTTP/1.1" 503 299 vcap_request_id=598f6394-9ae8-4b17-6850-2389a0024936 peer_addr=10.255.93.60

Sketch

Adding memory_limit="500M" to the PHP config (for example https://github.com/GSA/project-open-data-dashboard/blob/main/.bp-config/php/php.ini.d/uploads.ini) is necessary to get past an initial error. After that we see [proxy_fcgi:error] ... (104)Connection reset by peer, which seems to be related to timeout settings.

FuhuXia commented 1 year ago

Tested with the latest version: it returns a 500 error for a 32M file, but a 16M file is OK.

2023-02-17T10:41:10.80-0500 [APP/PROC/WEB/0] OUT 15:41:10 httpd   | [Fri Feb 17 15:41:10.777457 2023] [proxy_fcgi:error] [pid 268:tid 140657240917760] [client 127.0.0.1:0] AH01071: Got error 'PHP message: PHP Fatal error:  Allowed memory size of 134217728 bytes exhausted (tried to allocate 4096 bytes) in /home/vcap/app/application/models/Campaign_model.php on line 1191PHP message: PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 4096 bytes) in /home/vcap/app/vendor/codeigniter/framework/system/core/Common.php on line 570'
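The limit in the log above is given in bytes. As a quick worked conversion, 134217728 bytes is exactly 128 MiB, which is PHP's default memory_limit; that may indicate the memory_limit="500M" setting from the earlier sketch was not in effect for this test. The helper name `memory_limit_mib` below is hypothetical, and the snippet is only a sketch for reading the figure out of such a message.

```python
import re

# Excerpt copied from the php-fpm fatal error in the log above.
LOG_EXCERPT = (
    "PHP Fatal error:  Allowed memory size of 134217728 bytes exhausted "
    "(tried to allocate 4096 bytes)"
)


def memory_limit_mib(log_text):
    """Return the memory limit named in a PHP fatal-error message, in MiB, or None."""
    match = re.search(r"Allowed memory size of (\d+) bytes exhausted", log_text)
    if match is None:
        return None
    return int(match.group(1)) / (1024 * 1024)


print(memory_limit_mib(LOG_EXCERPT))  # prints 128.0
```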

btylerburton commented 1 month ago

The dashboard has been deleted.