anchore / vunnel

Tool for collecting vulnerability data from various sources (used to build the grype database)
Apache License 2.0

Wolfi and Chainguard providers are not handling download errors as expected #515

Open dperry opened 6 months ago

dperry commented 6 months ago

What happened: I ran a feed service sync on Anchore Enterprise, which uses vunnel to download provider vulnerability data. I ran it with my internet connection disabled to check that each provider would error and mark its status as failed, as it should.

Running:  
Completed:  chainguard wolfi
Failed:  anchore-match-exclusions sles mariner nvd rhel ubuntu oracle debian alpine amazon

However, as you can see, chainguard and wolfi reported their status as completed.

        {
            "driver_id": "chainguard",
            "end_time": "2024-03-14T10:14:09.651947+00:00",
            "parent_task_id": 139,
            "start_time": "2024-03-14T10:12:12.256996+00:00",
            "started_by": "system",
            "status": "completed",
            "task_id": 151,
            "task_type": "VunnelProviderExecutionTask"
        },
        {
            "driver_id": "wolfi",
            "end_time": "2024-03-14T10:12:12.255998+00:00",
            "parent_task_id": 139,
            "start_time": "2024-03-14T10:10:14.932096+00:00",
            "started_by": "system",
            "status": "completed",
            "task_id": 150,
            "task_type": "VunnelProviderExecutionTask"
        },
        ....
        {
            "driver_id": "sles",
            "end_time": "2024-03-14T10:09:55.441728+00:00",
            "parent_task_id": 139,
            "result": {
                "error": "The sles vunnel provider failed to complete succesfully.  Check the feed service logs for specific details of the failure.  If available, stale sles results will be used until the next successful run.",
                "error_details": "error: 1 error occurred:\n\t* failed to pull data from \"sles\" provider: command failed: 1\n\n",
                "failed_command": "grype-db pull -c /tmp/feeds_workspace/drivers/grypedb/grype-db.yaml -p sles"
            },
            "start_time": "2024-03-14T10:04:49.132466+00:00",
            "started_by": "system",
            "status": "failed",
            "task_id": 148,
            "task_type": "VunnelProviderExecutionTask"
        },
        ...

What you expected to happen:

I expected to see "status": "failed" for chainguard and wolfi as well, since there was no internet connectivity to reach the provider service.

I also expected output similar to the following:

        {
            "driver_id": "sles",
            "end_time": "2024-03-14T10:09:55.441728+00:00",
            "parent_task_id": 139,
            "result": {
                "error": "The sles vunnel provider failed to complete succesfully.  Check the feed service logs for specific details of the failure.  If available, stale sles results will be used until the next successful run.",
                "error_details": "error: 1 error occurred:\n\t* failed to pull data from \"sles\" provider: command failed: 1\n\n",
                "failed_command": "grype-db pull -c /tmp/feeds_workspace/drivers/grypedb/grype-db.yaml -p sles"
            },
            "start_time": "2024-03-14T10:04:49.132466+00:00",
            "started_by": "system",
            "status": "failed",
            "task_id": 148,
            "task_type": "VunnelProviderExecutionTask"
        },

How to reproduce it (as minimally and precisely as possible):

Run vunnel with no internet connection and see that the wolfi and chainguard status is still 'completed'.

Anything else we need to know?:

This might be due to how wolfi and chainguard both use the same endpoint. It could also be due to an aggressive try/except that swallows all errors around downloading: https://github.com/anchore/vunnel/blob/main/src/vunnel/providers/wolfi/parser.py#L56
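
To illustrate the suspected failure mode, here is a minimal sketch; it is not the actual vunnel code. All names (`download_swallowing`, `download_propagating`, `run_provider`, `offline_fetch`) are hypothetical, and `run_provider` is only a stand-in for whatever decides the task status. It assumes the status depends on whether an exception escapes the download step.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("provider-sketch")


def download_swallowing(fetch: Callable[[], bytes]) -> Optional[bytes]:
    """Hypothetical: catches every exception, so the caller never sees a failure."""
    try:
        return fetch()
    except Exception:
        logger.exception("download failed")
        return None  # the error is logged but otherwise lost


def download_propagating(fetch: Callable[[], bytes]) -> bytes:
    """Hypothetical: logs the failure, then re-raises it to the caller."""
    try:
        return fetch()
    except Exception:
        logger.exception("download failed")
        raise


def run_provider(download: Callable, fetch: Callable[[], bytes]) -> str:
    """Stand-in task runner: the status depends only on whether an exception escapes."""
    try:
        download(fetch)
    except Exception:
        return "failed"
    return "completed"


def offline_fetch() -> bytes:
    # Simulates a disabled network connection.
    raise ConnectionError("network unreachable")


if __name__ == "__main__":
    print(run_provider(download_swallowing, offline_fetch))
    print(run_provider(download_propagating, offline_fetch))
```

Under these assumptions, the swallowing variant reports "completed" even though nothing was downloaded, while the re-raising variant reports "failed".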

Environment:

tgerla commented 6 months ago

Hey @luhring or other Chainguard folks, are you able to take a look at this? Thanks!

westonsteimel commented 6 months ago

This isn't really an upstream issue; it's just an issue on our end, as we're swallowing all exceptions and shouldn't be. A failure to get data should result in the provider run failing.

tgerla commented 6 months ago

Thanks @westonsteimel, I thought maybe since Dan contributed the original provider he might want to make the fix. :)

luhring commented 6 months ago

Just getting back from parental leave, but let me know if I can help with anything!