project-koku / korekuta

Read Only Please See: https://github.com/project-koku/korekuta-operator
GNU Affero General Public License v3.0

implement #122 - splitting files #133

Closed blentz closed 4 years ago

blentz commented 4 years ago

This PR implements the behavior requested in #122: splitting files when they are too large. Because of limitations in Ansible 2.9 and earlier, I've reworked a portion of the playbook into a Python script.

The new script handles the entire tarball generation process, including checking file sizes and ensuring each resulting tarball is below the requested size limit. The script splits the CSVs when either any single file or the combined size of all files exceeds the requested limit. Each split file begins with the header from the original CSV.
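
For readers who want a feel for the header-preserving split, here is a minimal sketch. It is not the code from package_report.py: the function name `split_csv`, the `max_bytes` parameter (a byte count, which may not match the units of `--max-size`), and the line-by-line strategy are all assumptions made for illustration. Only the `<name>_<n>.csv` naming is taken from the log output in this PR. The sketch also assumes no CSV field contains an embedded newline, and a single row larger than the limit will still produce an oversized part.

```python
import os


def split_csv(path, max_bytes):
    """Split a CSV into parts of at most max_bytes, each starting with the header.

    Hypothetical helper for illustration. Returns the list of part paths,
    or [path] when the file already fits or contains no data rows.
    """
    if os.path.getsize(path) <= max_bytes:
        return [path]

    root, ext = os.path.splitext(path)  # ".../report.3.csv" -> (".../report.3", ".csv")
    parts = []
    out = None
    written = 0
    try:
        with open(path, "rb") as src:
            header = src.readline()
            for line in src:
                # Open a new part when none is open yet or this row would overflow it.
                if out is None or written + len(line) > max_bytes:
                    if out is not None:
                        out.close()
                    part_path = f"{root}_{len(parts) + 1}{ext}"
                    out = open(part_path, "wb")
                    out.write(header)  # every part repeats the original header
                    written = len(header)
                    parts.append(part_path)
                out.write(line)
                written += len(line)
    finally:
        if out is not None:
            out.close()
    return parts or [path]
```

Calling `split_csv("/path/to/report.3.csv", 99 * 1024 * 1024)` under these assumptions would yield `report.3_1.csv`, `report.3_2.csv`, and so on, leaving the original file in place.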

Test case: Original behavior is preserved when there is nothing to split

TASK [collect : Run packaging script to prepare reports for sending to Insights] ***************************************************************************
changed: [localhost] => {"changed": true, "cmd": ["roles/collect/files/package_report.py", "-f", "/tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe", "--ocp-api", "REDACTED", "--metering-api", "REDACTED", "--ocp-cli", "/usr/local/bin/oc", "--ocp-cluster-id", "2cdf3fa1-4ea1-4046-b110-eee40d770efe", "--ocp-namespace", "openshift-metering", "--ocp-token-path", "/root/token", "--overwrite"], "delta": "0:00:00.368503", "end": "2020-02-26 17:47:38.190599", "rc": 0, "start": "2020-02-26 17:47:37.822096", "stderr": "", "stderr_lines": [], "stdout": "/tmp/korekuta-collect/korekuta.tar.gz", "stdout_lines": ["/tmp/korekuta-collect/korekuta.tar.gz"]}

TASK [collect : Send payload to the Insights Client] *******************************************************************************************************
changed: [localhost] => (item=/tmp/korekuta-collect/korekuta.tar.gz) => {"ansible_loop_var": "item", "changed": true, "cmd": ["/usr/bin/insights-client", "--payload=/tmp/korekuta-collect/korekuta.tar.gz", "--content-type=application/vnd.redhat.hccm.tar+tgz"], "delta": "0:00:04.558527", "end": "2020-02-26 17:47:43.238074", "item": "/tmp/korekuta-collect/korekuta.tar.gz", "rc": 0, "start": "2020-02-26 17:47:38.679547", "stderr": "Uploading Insights data.\nSuccessfully uploaded report for a02b35b563fd.", "stderr_lines": ["Uploading Insights data.", "Successfully uploaded report for a02b35b563fd."], "stdout": "", "stdout_lines": []}

TASK [collect : Remove temp files] *************************************************************************************************************************
skipping: [localhost] => {"changed": false, "skip_reason": "Conditional result was False"}

TASK [collect : Remove temp config] ************************************************************************************************************************
skipping: [localhost] => {"changed": false, "skip_reason": "Conditional result was False"}

TASK [collect : Remove tarball] ****************************************************************************************************************************
skipping: [localhost] => {"changed": false, "skip_reason": "Conditional result was False"}

PLAY RECAP *************************************************************************************************************************************************
localhost                  : ok=20   changed=3    unreachable=0    failed=0    skipped=9    rescued=0    ignored=0   

Execution complete.

Test case: running the script on a file that needs splitting (the -vv flag is added to show script operations)

(korekuta) [root@a02b35b563fd korekuta]# ls -lh /tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/
total 1.6G
-rw-r--r--. 1 root root 1008K Feb 26 21:01 bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.0.csv
-rw-r--r--. 1 root root   18K Feb 26 21:01 bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.1.csv
-rw-r--r--. 1 root root   12K Feb 26 21:01 bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.2.csv
-rw-r--r--. 1 root root  984M Feb 26 21:23 bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3.csv
-rw-r--r--. 1 root root   336 Feb 26 21:20 manifest.json
(korekuta) [root@a02b35b563fd korekuta]# roles/collect/files/package_report.py -f /tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe --max-size 99 --ocp-api REDACTED --metering-api REDACTED --ocp-cli /usr/local/bin/oc --ocp-cluster-id 2cdf3fa1-4ea1-4046-b110-eee40d770efe --ocp-metering-namespace openshift-metering --ocp-token-file /root/token --overwrite -vv
2020-02-26 21:24:17,226 [INFO] manifest generated
2020-02-26 21:24:17,227 [INFO] Writing new file: /tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_1.csv
2020-02-26 21:24:22,319 [INFO] Writing new file: /tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_2.csv
2020-02-26 21:24:27,292 [INFO] Writing new file: /tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_3.csv
2020-02-26 21:24:32,190 [INFO] Writing new file: /tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_4.csv
2020-02-26 21:24:37,054 [INFO] Writing new file: /tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_5.csv
2020-02-26 21:24:41,986 [INFO] Writing new file: /tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_6.csv
2020-02-26 21:24:46,938 [INFO] Writing new file: /tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_7.csv
2020-02-26 21:24:51,869 [INFO] Writing new file: /tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_8.csv
2020-02-26 21:24:56,880 [INFO] Writing new file: /tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_9.csv
2020-02-26 21:25:01,798 [INFO] Writing new file: /tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_10.csv
2020-02-26 21:25:06,701 [INFO] Writing new file: /tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_11.csv
2020-02-26 21:25:07,825 [INFO] Split files: ['/tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_1.csv', '/tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_2.csv', '/tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_3.csv', '/tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_4.csv', '/tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_5.csv', '/tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_6.csv', '/tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_7.csv', '/tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_8.csv', '/tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_9.csv', '/tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_10.csv', '/tmp/korekuta-collect/2cdf3fa1-4ea1-4046-b110-eee40d770efe/bd330bb2-f7f6-5907-acfd-692595268723_openshift_usage_report.3_11.csv']
2020-02-26 21:25:10,304 [INFO] Wrote: /tmp/korekuta-collect/korekuta_0.tar.gz
2020-02-26 21:25:12,666 [INFO] Wrote: /tmp/korekuta-collect/korekuta_1.tar.gz
2020-02-26 21:25:12,691 [INFO] Wrote: /tmp/korekuta-collect/korekuta_2.tar.gz
2020-02-26 21:25:13,177 [INFO] Wrote: /tmp/korekuta-collect/korekuta_3.tar.gz
2020-02-26 21:25:15,575 [INFO] Wrote: /tmp/korekuta-collect/korekuta_4.tar.gz
2020-02-26 21:25:17,956 [INFO] Wrote: /tmp/korekuta-collect/korekuta_5.tar.gz
2020-02-26 21:25:20,343 [INFO] Wrote: /tmp/korekuta-collect/korekuta_6.tar.gz
2020-02-26 21:25:20,344 [INFO] Wrote: /tmp/korekuta-collect/korekuta_7.tar.gz
2020-02-26 21:25:22,739 [INFO] Wrote: /tmp/korekuta-collect/korekuta_8.tar.gz
2020-02-26 21:25:25,185 [INFO] Wrote: /tmp/korekuta-collect/korekuta_9.tar.gz
2020-02-26 21:25:27,577 [INFO] Wrote: /tmp/korekuta-collect/korekuta_10.tar.gz
2020-02-26 21:25:27,578 [INFO] Wrote: /tmp/korekuta-collect/korekuta_11.tar.gz
2020-02-26 21:25:29,949 [INFO] Wrote: /tmp/korekuta-collect/korekuta_12.tar.gz
2020-02-26 21:25:32,332 [INFO] Wrote: /tmp/korekuta-collect/korekuta_14.tar.gz
/tmp/korekuta-collect/korekuta_0.tar.gz
/tmp/korekuta-collect/korekuta_1.tar.gz
/tmp/korekuta-collect/korekuta_2.tar.gz
/tmp/korekuta-collect/korekuta_3.tar.gz
/tmp/korekuta-collect/korekuta_4.tar.gz
/tmp/korekuta-collect/korekuta_5.tar.gz
/tmp/korekuta-collect/korekuta_6.tar.gz
/tmp/korekuta-collect/korekuta_7.tar.gz
/tmp/korekuta-collect/korekuta_8.tar.gz
/tmp/korekuta-collect/korekuta_9.tar.gz
/tmp/korekuta-collect/korekuta_10.tar.gz
/tmp/korekuta-collect/korekuta_11.tar.gz
/tmp/korekuta-collect/korekuta_12.tar.gz
/tmp/korekuta-collect/korekuta_14.tar.gz