aws-samples / aws-batch-operational-dashboards

https://aws.amazon.com/batch/
MIT No Attribution

Fetch error: 404 Not Found Instantiating; when trying to add Athena data source #31

Open uray-scalebio opened 4 months ago

uray-scalebio commented 4 months ago

Hello. When I try to add an Amazon Athena data source, I get the error below. [image] I am not prompted for any information about the Athena data source, as shown in your tutorial. The moment I click Athena on the add data source page, it shows "Data Source added" and I immediately get the error in the screenshot above. I've tried Athena plugin versions 2.13.5 and 2.14.0. I am running Grafana 9.4. Is Athena part of the Enterprise Plugins, and is that why I'm getting this error?

mhuguesaws commented 4 months ago

Did you look at the response in #28 ?

uray-scalebio commented 4 months ago

Yes, I did. I cloned the latest master, so PluginAdminEnabled is set to true in the yml file. I followed instructions 1-5 to install Athena, and the installation succeeds. The issue is with adding a data source.

mhuguesaws commented 4 months ago

I was able to reproduce the problem, and it is related to managed Grafana. I asked internally whether there are any solutions. I was able to get the Amazon Athena plugin working with version 2.3.4.

mhuguesaws commented 4 months ago

Apparently, this happens because the plugin is not fully installed. The suggestion is to wait up to 5 minutes for the plugin to finish installing, then try adding a data source again.

uray-scalebio commented 4 months ago

Thanks for the suggestion to wait 5 minutes. That worked, and I was able to add an Athena data source. After doing so, running the generate-grafana-dashboard.py script, and uploading the batch-grafana-dashboard.json file on the "Import dashboard" page of the Grafana dashboard, I get the following error when I open the newly created dashboard:

error executing query: InvalidRequestException: 1 validation error detected: Value '' at 'workGroup' failed to satisfy constraint: Member must satisfy regular expression pattern: [a-zA-Z0-9._-]{1,128} { RespMetadata: { StatusCode: 400, RequestID: "ac76eb3f-813d-431d-a981-cfeba7f71260" }, AthenaErrorCode: "INVALID_INPUT", Message_: "1 validation error detected: Value '' at 'workGroup' failed to satisfy constraint: Member must satisfy regular expression pattern: [a-zA-Z0-9._-]{1,128}" }

Is there a null value of some sort in one of the Athena workgroup fields? We were not using Athena before this, so there might be some basic setting that I haven't turned on.
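The constraint quoted in the error can be checked locally. The following is a minimal sketch, assuming only the regular expression that appears in the error message; the helper name `is_valid_workgroup` is hypothetical and not part of the repository or the Athena plugin. It shows why an empty workgroup value is rejected while a non-empty name such as `primary` (Athena's default workgroup name) passes.

```python
import re

# Pattern copied from the Athena error message above.
WORKGROUP_PATTERN = re.compile(r"[a-zA-Z0-9._-]{1,128}")


def is_valid_workgroup(name: str) -> bool:
    """Return True if the whole name matches the pattern from the error message.

    Hypothetical helper for illustration only.
    """
    return WORKGROUP_PATTERN.fullmatch(name) is not None


print(is_valid_workgroup(""))         # False: the empty value reported in the error
print(is_valid_workgroup("primary"))  # True: a non-empty name within the allowed characters
```

If the check fails for the value configured on the data source, the workgroup field in the Athena data source settings is a reasonable first place to look.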

mhuguesaws commented 4 months ago

Did you run any jobs through AWS Batch? A job must go through AWS Batch and complete with Success or Failure to generate the field for Athena.

uray-scalebio commented 4 months ago

Yes I have container insights enabled for one ECS cluster, which corresponds to a queue on Batch. We have jobs regularly running on that queue. So could there be anything else causing the error?

mhuguesaws commented 4 months ago

Has at least one job completed? I am unable to reproduce the error you are observing.

uray-scalebio commented 4 months ago

When I inspect the query that the Grafana dashboard runs against the Athena data source, I see this query:

    SELECT jobId, jobName, jobStatus, startedAt, stoppedAt, jobQueue, vCPUs, Memory, instancetype, instanceId, platform, purchaseOption, logStream, taskArn FROM \"$__table\" WHERE startedAt > 1711053092366 AND (cast(stoppedAt as decimal(38,9)) < 1711054892367 OR stoppedAt is NULL) AND REGEXP_LIKE(instanceId,'') AND REGEXP_LIKE(taskArn,) AND REGEXP_LIKE(jobQueue,);

However, when I look at my table in Athena, it has only these three columns: jobid, jobstatus and stoppedat. I assume that's why there's an error, because the other fields are missing from the Athena table. Maybe I have to recreate it?
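To make the gap concrete, the columns named in the query can be compared with the columns the table actually has. This is a small sketch using only the names quoted in this thread; the function name `missing_columns` is hypothetical, and the comparison ignores case on the assumption that the lowercase names in the Athena table correspond to the mixed-case names in the query.

```python
# Columns named in the dashboard query quoted above.
QUERY_COLUMNS = [
    "jobId", "jobName", "jobStatus", "startedAt", "stoppedAt", "jobQueue",
    "vCPUs", "Memory", "instancetype", "instanceId", "platform",
    "purchaseOption", "logStream", "taskArn",
]

# Columns reported for the Athena table in this thread.
TABLE_COLUMNS = ["jobid", "jobstatus", "stoppedat"]


def missing_columns(expected, actual):
    """Return the expected names that are absent from actual, ignoring case.

    Hypothetical helper for illustration only.
    """
    present = {name.lower() for name in actual}
    return [name for name in expected if name.lower() not in present]


for name in missing_columns(QUERY_COLUMNS, TABLE_COLUMNS):
    print(name)
```

Every name this prints is a column the dashboard query expects but the table does not provide.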

mhuguesaws commented 4 months ago

Can you check DynamoDB to see whether there is a startedAt field? An additional question: do you have a trail in CloudTrail? This is not set up by the template.

uray-scalebio commented 4 months ago

These are the columns in DynamoDB: jobId | jobStatus | stoppedAT

I initially didn't have Container Insights turned on for any of my ECS clusters. I enabled it for one cluster and then ran the dashboard creation Python script. I printed out the clusterArn here https://github.com/aws-samples/aws-batch-operational-dashboards/blob/main/generate-grafana-dashboard.py#L71 and the cluster with Container Insights enabled was printed. The tables in Athena and DynamoDB do have entries; they just don't seem to have all the columns. What could be the reason for that? No, we do not use CloudTrail.

mhuguesaws commented 4 months ago

The lack of a trail in CloudTrail is probably the cause. I suggest that you create a trail and observe the Step Functions execution logs.

I'll try to find an account without a trail to confirm this in a couple of days.

mhuguesaws commented 3 months ago

@uray-scalebio did the creation of a trail solve your challenges?

mhuguesaws commented 2 months ago

@uray-scalebio looking back at your message about your DynamoDB content, I suspect you are using memory and vcpus in containerProperties, which looks like this:

"containerProperties": {
  "vcpus": 2,
  "memory": 2048
}

Both of those are deprecated per the documentation. Please use resourceRequirements instead, such as:

    "containerProperties": {
        "resourceRequirements": [
            {
                "type": "MEMORY",
                "value": "2048"
            },
            {
                "type": "VCPU",
                "value": "2"
            }
        ]
    }
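
The conversion between the two forms above can be sketched in Python. This is only an illustration under stated assumptions: `migrate_container_properties` is a hypothetical helper, not part of the repository or the AWS SDK, and it works on a plain dictionary shaped like the containerProperties snippets in this thread. It moves top-level `vcpus` and `memory` into `resourceRequirements` as string values, and leaves any existing requirement of the same type untouched.

```python
import copy


def migrate_container_properties(props: dict) -> dict:
    """Return a copy of props with top-level vcpus/memory moved into resourceRequirements.

    Hypothetical helper for illustration only.
    """
    migrated = copy.deepcopy(props)
    requirements = list(migrated.get("resourceRequirements", []))
    existing_types = {req["type"] for req in requirements}

    # MEMORY first, then VCPU, matching the order of the example above.
    for old_key, req_type in (("memory", "MEMORY"), ("vcpus", "VCPU")):
        if old_key in migrated:
            value = migrated.pop(old_key)
            # Keep an existing requirement rather than overwrite it.
            if req_type not in existing_types:
                requirements.append({"type": req_type, "value": str(value)})

    if requirements:
        migrated["resourceRequirements"] = requirements
    return migrated


old_props = {"vcpus": 2, "memory": 2048}
print(migrate_container_properties(old_props))
```

The input dictionary is not modified, so the old and new definitions can be compared before a job definition is re-registered.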