project-koku / masu

This is a READ ONLY repo. See https://github.com/project-koku/koku for current masu implementation
GNU Affero General Public License v3.0

Catch masu-listener line item processing exceptions #463

Closed dccurtis closed 5 years ago

dccurtis commented 5 years ago

Addresses #462

The asyncio method run_in_executor should be run inside a try/except block so that an exception raised in the executor thread is caught and logged instead of blocking the event loop.
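A minimal sketch of the pattern described above, not the actual masu code. The blocking work is handed to `run_in_executor`, and the `await` is wrapped in `try`/`except` so that a failure in the worker thread is logged rather than propagating out of the coroutine. The names `process_report` and `handle_message` are hypothetical, and the log message format is an assumption modeled on the listener output.

```python
import asyncio
import logging

LOG = logging.getLogger(__name__)


def process_report(payload):
    """Hypothetical blocking worker; raises to simulate a processing failure."""
    if payload.get("broken"):
        raise AttributeError("'api_provider' object has no attribute 'setup_complete'")
    return f"processed {payload['request_id']}"


async def handle_message(loop, payload):
    """Run the blocking worker in the default executor and contain its exceptions."""
    try:
        # An exception raised in the worker thread is re-raised by this await.
        return await loop.run_in_executor(None, process_report, payload)
    except Exception as error:  # broad on purpose: the listener must keep running
        LOG.error("Thread exception: %s", error)
        return None


async def main():
    loop = asyncio.get_running_loop()
    # The first message fails; the second is still handled afterwards.
    first = await handle_message(loop, {"request_id": "a", "broken": True})
    second = await handle_message(loop, {"request_id": "b"})
    return first, second


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Catching `Exception` broadly is a deliberate choice in this sketch: the goal is to keep the listener alive for the next message, so any failure is logged and swallowed at this boundary.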

Testing

  1. Make a model change to simulate the failure seen in issue #462. In this case, removing the setup_complete field from the Provider model was easy to do.
  2. Ingest an OCP payload and observe the exception occurring. Verify that the payload can be sent and processed two times. (Before this change, the second upload of the payload would never be processed because the first was blocking the event loop.)
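The verification in step 2 can be sketched roughly as follows, under loud assumptions: an in-memory `asyncio.Queue` stands in for the Kafka topic, and `process_payload`, `listen`, and `upload_twice` are hypothetical names, not masu functions. The worker always fails, as it would with the broken model, and the loop should still reach the second payload.

```python
import asyncio


def process_payload(payload):
    """Hypothetical blocking worker that always fails, like the broken model."""
    raise AttributeError("'api_provider' object has no attribute 'setup_complete'")


async def listen(queue, errors):
    """Consume payloads until a None sentinel arrives, recording each failure."""
    loop = asyncio.get_running_loop()
    while True:
        payload = await queue.get()
        if payload is None:
            break
        try:
            await loop.run_in_executor(None, process_payload, payload)
        except Exception as error:
            # Record the failure and move on to the next payload.
            errors.append((payload, str(error)))


async def upload_twice():
    """Queue the same kind of payload twice and return the recorded failures."""
    queue = asyncio.Queue()
    errors = []
    for payload in ("upload-1", "upload-2", None):
        queue.put_nowait(payload)
    await listen(queue, errors)
    return errors


if __name__ == "__main__":
    for payload, message in asyncio.run(upload_twice()):
        print(payload, "->", message)
```

Two recorded failures, one per upload, correspond to the two "Thread exception" lines expected in the listener log.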

Test Results

1. Koku Change

diff --git a/koku/api/provider/models.py b/koku/api/provider/models.py
index 390aebf..f18171a 100644
--- a/koku/api/provider/models.py
+++ b/koku/api/provider/models.py
@@ -83,7 +83,6 @@ class Provider(models.Model):
                                  on_delete=models.PROTECT)
     created_by = models.ForeignKey('User', null=True,
                                    on_delete=models.SET_NULL)
-    setup_complete = models.BooleanField(default=False)

 class ProviderStatus(models.Model):

2. Upload a nise-generated payload to the upload service two times.

masu_listener    | [2019-06-12 02:16:50,921] INFO in kafka_msg_handler: Listener started.  Waiting for messages...
masu_listener    | [2019-06-12 02:16:55,804] INFO in kafka_msg_handler: Successfully extracted OCP for my-ocp-cluster-1/20190601-20190701
masu_listener    | [2019-06-12 02:16:55,820] INFO in kafka_msg_handler: Validating message: b'{"request_id": "52df9f748eabcfea", "validation": "success"}'
masu_listener    | [2019-06-12 02:16:56,765] INFO in kafka_msg_handler: Processing report for account {'authentication': 'my-ocp-cluster-1', 'customer_name': 'acct10001', 'billing_source': '', 'provider_type': 'OCP', 'schema_name': 'acct10001', 'provider_uuid': 'f60fbef4-7e0f-4d28-bb41-86be5607e752'}
masu_listener    | [2019-06-12 02:16:57,137] ERROR in kafka_msg_handler: Thread exception: 'api_provider' object has no attribute 'setup_complete'
masu_listener    | [2019-06-12 02:18:29,497] INFO in kafka_msg_handler: Successfully extracted OCP for my-ocp-cluster-1/20190601-20190701
masu_listener    | [2019-06-12 02:18:29,563] INFO in kafka_msg_handler: Validating message: b'{"request_id": "52df9f748eabcfea", "validation": "success"}'
masu_listener    | [2019-06-12 02:18:30,514] INFO in kafka_msg_handler: Processing report for account {'authentication': 'my-ocp-cluster-1', 'customer_name': 'acct10001', 'billing_source': '', 'provider_type': 'OCP', 'schema_name': 'acct10001', 'provider_uuid': 'f60fbef4-7e0f-4d28-bb41-86be5607e752'}
masu_listener    | [2019-06-12 02:18:30,803] ERROR in kafka_msg_handler: Thread exception: 'api_provider' object has no attribute 'setup_complete'

codecov[bot] commented 5 years ago

Codecov Report

No coverage uploaded for pull request base (master@fd3c521). The diff coverage is n/a.

@@           Coverage Diff            @@
##             master    #463   +/-   ##
========================================
  Coverage          ?   97.7%           
========================================
  Files             ?      67           
  Lines             ?    3742           
  Branches          ?     393           
========================================
  Hits              ?    3657           
  Misses            ?      50           
  Partials          ?      35

Impacted Files                        Coverage Δ
masu/external/kafka_msg_handler.py    100% <ø> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update fd3c521...71ebb2c.

dccurtis commented 5 years ago

Hitting a bad intermittent failure in test_populate_ocp_on_aws_cost_daily_summary. Will fix in a separate PR, but will probably continue to hit re-test for now to get this in :)