Closed achave11-ucsc closed 10 months ago
@hannes-ucsc: 'Spike to check if this warning still occurs in production'
The unexpected warning still occurs in production. With the most recent instance being on 08/13/2021
@timestamp | @message |
---|---|
2021-08-13 21:01:19.065 | [WARNING] 2021-08-13T21:01:19.065Z 724b8ab9-929f-4509-97b1-f61dba7151e5 Manifest object 'manifests/e897fe08-7ea1-5cbe-b1b0-4ef258fa3f59.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-08-13 21:01:07.612 | [WARNING] 2021-08-13T21:01:07.612Z b500de8e-8cb0-4e9e-9098-8d875d73b4c3 Manifest object 'manifests/ad6c80bb-fc94-52c0-be9a-aa5b3200bbc7.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-08-13 14:28:23.213 | [WARNING] 2021-08-13T14:28:23.213Z 8d212d85-eaf3-4af4-b45c-bab2ed402c77 Manifest object 'manifests/df24be44-ced6-5721-8afb-dd99bc1be816.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-08-13 02:16:06.735 | [WARNING] 2021-08-13T02:16:06.735Z 3c207268-10f8-472b-a3c4-b8a1071892b3 Manifest object 'manifests/0259e701-bc47-5ed0-a870-8ac588ece711.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-08-13 02:15:54.872 | [WARNING] 2021-08-13T02:15:54.872Z 37bd3078-8c05-444e-9341-0bd0e7b6d867 Manifest object 'manifests/20cf880c-9713-5946-8166-1813909e723d.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-08-10 02:42:33.972 | [WARNING] 2021-08-10T02:42:33.972Z 913a2a44-bc94-4fef-a255-cfd9f0c1488a Manifest object 'manifests/40ee009a-4508-5559-b86a-c734b2528631.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-08-10 02:42:20.000 | [WARNING] 2021-08-10T02:42:20.000Z 62d913a8-b195-4329-ba49-bbdacb85039b Manifest object 'manifests/db288911-209e-59b1-81b1-85bfa8048424.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-08-09 20:58:25.117 | [WARNING] 2021-08-09T20:58:25.116Z 692da111-055a-4fcf-950d-8343c0c9ab91 Manifest object 'manifests/26d92509-44c6-5f43-a0e6-5dc9443a76e9.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-08-05 02:17:41.157 | [WARNING] 2021-08-05T02:17:41.157Z 48339d76-988a-47dd-8b1a-f6abffe8e1de Manifest object 'manifests/d166a632-50ed-5817-92d8-008b4767343c.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-07-31 01:27:58.993 | [WARNING] 2021-07-31T01:27:58.993Z 2f00def4-b9b6-4217-83bc-6d78a0d2ed50 Manifest object 'manifests/1b9c0ddc-b824-5f13-b881-6d63dfe60ab2.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-07-31 01:27:57.730 | [WARNING] 2021-07-31T01:27:57.730Z 46646876-a56f-4b4d-b0f8-82a1566a32cd Manifest object 'manifests/66cf5eaa-2d7b-512e-82c7-c2fca499f04d.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-07-22 19:19:20.658 | [WARNING] 2021-07-22T19:19:20.658Z 403e558f-1af5-45a2-ade7-b075b59ab7b5 Manifest object 'manifests/d13d64d4-4d39-5664-9f1b-2cde81f0865b.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-07-22 19:19:05.713 | [WARNING] 2021-07-22T19:19:05.712Z c2d5a6cb-02eb-4e48-a95b-190612a67ef2 Manifest object 'manifests/1078fe26-5318-5259-b05b-af0fe98b515e.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-07-22 18:18:41.018 | [WARNING] 2021-07-22T18:18:41.018Z d0442271-254f-498d-b159-476453076a89 Manifest object 'manifests/49110c6f-cfee-5e95-b7b1-cdd0199bd976.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-07-22 18:18:38.378 | [WARNING] 2021-07-22T18:18:38.377Z abb71add-64f5-42e0-978c-f85e71b9c7f5 Manifest object 'manifests/92bb575e-6839-5bea-9f92-54c74b9d4376.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-07-22 18:18:22.045 | [WARNING] 2021-07-22T18:18:22.044Z 17313844-4c9e-480d-8d59-0a10c5f9a9f1 Manifest object 'manifests/b7fedbf8-2f58-54f8-b12f-3db6abf8f403.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-07-21 07:15:38.394 | [WARNING] 2021-07-21T07:15:38.394Z 4c56487a-a9b8-4d17-94d1-a1a57abad851 Manifest object 'manifests/b560250f-9b6d-5e20-adf3-7fc9f361740c.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-07-21 03:35:02.547 | [WARNING] 2021-07-21T03:35:02.546Z 7af9d64d-ab9d-4a75-bee9-6b34775b7e5d Manifest object 'manifests/0195e468-4f62-51b8-98c0-cbb89d13012f.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-07-21 03:34:48.380 | [WARNING] 2021-07-21T03:34:48.380Z b7ec2c6e-4caa-4225-9f80-bc5d89f1ec6a Manifest object 'manifests/663d575b-abb3-5c6f-a188-3299bc9f6fbd.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-07-16 23:46:34.256 | [WARNING] 2021-07-16T23:46:34.255Z d574d52e-1c02-4432-9917-ffcf0ca02185 Manifest object 'manifests/fa363915-ec27-534f-85be-97cd84497328.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
@achave11 to investigate one particular occurrence of this. Specifically, the log messages from all lambda invocations that are involved in that occurrence. Also, determine if tagging S3 objects is strongly consistent according to the AWS documentation.
The object key manifests/97c5ed54-85be-506b-ad27-6a70f60eb5df.tsv
produced the …
021-08-31T20:43:48.215-07:00 [WARNING] Manifest object … doesn't have the 'azul_file_name' tag.
… log message, indicating that the file name tag wasn't set. The following step function invocations were associated with the observed object key …
769acd57-f45c-4b4e-8d4c-34df44daf377 Start: 2021-08-31 20:43:46.717 End: 2021-08-31 20:43:48.320
035e9f21-ae00-410f-8039-ebe2710d1710 Start: 2021-08-31 20:43:47.864 End: 2021-08-31 20:43:48.343
… both having as input the following:
{
"format_": "full",
"catalog": "it2",
"filters": {},
"object_key": "manifests/97c5ed54-85be-506b-ad27-6a70f60eb5df.tsv",
"partition": {
"index": 0,
"is_last": false,
"file_name": null,
"config": null,
"multipart_upload_id": null,
"part_etags": null,
"page_index": null,
"is_last_page": null,
"search_after": null
}
}
It's immediately apparent that the Step Function execution starting with 035e9f21
was started 0.456s before the end of the 769acd57
(first) execution. Furthermore, it can be determined that the second SF execution returns a cached manifest, which would usually have been returned by the manifest request to the service if available sooner, avoiding an additional SF execution request. This can possibly be explained by the first SF execution having completed the multipart upload to S3 of the manifest but not having completed the file name tagging step. So when the second SF execution finds a cache manifest object, the first has not completed it's execution sequence, namely the file name tagging process.
Logs for both executions:
@timestamp | @message |
---|---|
2021-09-01 03:43:46.912 | START RequestId: c9a0bbf1-7293-426b-b830-aa42acc4d25c Version: $LATEST |
2021-09-01 03:43:46.927 | [INFO] 2021-09-01T03:43:46.926Z c9a0bbf1-7293-426b-b830-aa42acc4d25c Cached manifest not found: manifests/97c5ed54-85be-506b-ad27-6a70f60eb5df.tsv |
2021-09-01 03:43:46.953 | [INFO] 2021-09-01T03:43:46.953Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.025s] |
2021-09-01 03:43:46.980 | [INFO] 2021-09-01T03:43:46.979Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.026s] |
2021-09-01 03:43:47.055 | [INFO] 2021-09-01T03:43:47.054Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.074s] |
2021-09-01 03:43:47.082 | [INFO] 2021-09-01T03:43:47.082Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.027s] |
2021-09-01 03:43:47.109 | [INFO] 2021-09-01T03:43:47.109Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.026s] |
2021-09-01 03:43:47.135 | [INFO] 2021-09-01T03:43:47.135Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.026s] |
2021-09-01 03:43:47.161 | [INFO] 2021-09-01T03:43:47.161Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.026s] |
2021-09-01 03:43:47.188 | [INFO] 2021-09-01T03:43:47.187Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.025s] |
2021-09-01 03:43:47.222 | [INFO] 2021-09-01T03:43:47.221Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.033s] |
2021-09-01 03:43:47.249 | [INFO] 2021-09-01T03:43:47.248Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.026s] |
2021-09-01 03:43:47.275 | [INFO] 2021-09-01T03:43:47.275Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.025s] |
2021-09-01 03:43:47.302 | [INFO] 2021-09-01T03:43:47.301Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.026s] |
2021-09-01 03:43:47.329 | [INFO] 2021-09-01T03:43:47.329Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.027s] |
2021-09-01 03:43:47.412 | [INFO] 2021-09-01T03:43:47.411Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.081s] |
2021-09-01 03:43:47.438 | [INFO] 2021-09-01T03:43:47.438Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.026s] |
2021-09-01 03:43:47.464 | [INFO] 2021-09-01T03:43:47.464Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.025s] |
2021-09-01 03:43:47.464 | [INFO] 2021-09-01T03:43:47.464Z c9a0bbf1-7293-426b-b830-aa42acc4d25c Elasticsearch partitioned requests completed after 0.537s |
2021-09-01 03:43:47.526 | [INFO] 2021-09-01T03:43:47.526Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.033s] |
2021-09-01 03:43:47.566 | [INFO] 2021-09-01T03:43:47.566Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.033s] |
2021-09-01 03:43:47.608 | [INFO] 2021-09-01T03:43:47.608Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.035s] |
2021-09-01 03:43:47.648 | [INFO] 2021-09-01T03:43:47.648Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.034s] |
2021-09-01 03:43:47.691 | [INFO] 2021-09-01T03:43:47.690Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.036s] |
2021-09-01 03:43:47.751 | [INFO] 2021-09-01T03:43:47.751Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.053s] |
2021-09-01 03:43:47.802 | [INFO] 2021-09-01T03:43:47.802Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.042s] |
2021-09-01 03:43:47.831 | [INFO] 2021-09-01T03:43:47.831Z c9a0bbf1-7293-426b-b830-aa42acc4d25c GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_it2_bundles_aggregate/_search [status:200 request:0.026s] |
2021-09-01 03:43:48.067 | START RequestId: 29b09a23-d0e4-46dd-86df-686fd3813c73 Version: $LATEST |
2021-09-01 03:43:48.151 | [DEBUG] 2021-09-01T03:43:48.151Z 29b09a23-d0e4-46dd-86df-686fd3813c73 Manifest object expires in 677771.848932 seconds, on 2021-09-09 00:00:00+00:00 |
2021-09-01 03:43:48.196 | END RequestId: c9a0bbf1-7293-426b-b830-aa42acc4d25c |
2021-09-01 03:43:48.196 | REPORT RequestId: c9a0bbf1-7293-426b-b830-aa42acc4d25c Duration: 1281.01 ms Billed Duration: 1282 ms Memory Size: 1024 MB Max Memory Used: 157 MB |
2021-09-01 03:43:48.215 | [WARNING] 2021-09-01T03:43:48.215Z 29b09a23-d0e4-46dd-86df-686fd3813c73 Manifest object 'manifests/97c5ed54-85be-506b-ad27-6a70f60eb5df.tsv' doesn't have the 'azul_file_name' tag. Generating pre-signed URL without Content-Disposition header. |
2021-09-01 03:43:48.217 | END RequestId: 29b09a23-d0e4-46dd-86df-686fd3813c73 |
2021-09-01 03:43:48.217 | REPORT RequestId: 29b09a23-d0e4-46dd-86df-686fd3813c73 Duration: 146.01 ms Billed Duration: 147 ms Memory Size: 1024 MB Max Memory Used: 131 MB |
Edit: Similar time overlap with SF executions in quick succession is observed in other manifest generations that warn of the missing file name tag.
As per AWS S3 strong consistency regarding tags, under the S3 is Now Strongly Consistent
heading, the following is stated:
Effective immediately, all S3 GET, PUT, and LIST operations, as well as operations that change object tags,
ACLs, or metadata, are now strongly consistent.
@hannes-ucsc : "I agree with the above analysis. The resolution for #3271 prevents multiple concurrent step function executions per manifest object, allowing for a tightening from just logging a warning to raising an exception."
prod
.The warning no longer occurs in production. The last instance of this warning was on 11/04/2023.
CloudWatch Logs Insights)
region: us-east-1
log-group-names: /aws/lambda/azul-service-prod-manifest
start-time: -2419200s
end-time: 0s
query-string:
fields @timestamp, @message, @logStream, @log
| filter @message like "Manifest object"
| filter @message like "does not have the"
| sort @timestamp desc
| limit 20
@timestamp | @message | @logStream | @log |
---|---|---|---|
2023-11-04 16:40:12.908 | [WARNING] 2023-11-04T16:40:12.908Z d7b435ab-bceb-401c-9c07-3f3bd256dfa9 azul.service.manifest_service Manifest object 'manifests/85842272-3c47-5971-9a5f-08d7c7c65c1f.4ba44e80-fe08-5a78-a952-966dbc5520c6.tsv' does not have the 'azul_file_name' tag. | 2023/11/04/[$LATEST]920460316e354a88a37ad004068fedde | 542754589326:/aws/lambda/azul-service-prod-manifest |
Assignee to continue spike to show that rate of incidence has actually decreased.
After surveying 12 weeks of log data and having observed at least one occurrence per week in the nine weeks prior, no matching logs are found for the most recent three weeks since the last log on 11/04/2023 (week starting on 2023-10-29).
CloudWatch Logs Insights
region: us-east-1
log-group-names: /aws/lambda/azul-service-prod-manifest
start-time: 2023-09-05T15:00:00.000Z
end-time: 2023-11-28T19:00:00.000Z
query-string:
fields @timestamp, @message, @log
| filter @message like "Manifest object"
| filter @message like "does not have the"
| sort @timestamp asc
| stats count(*) as occurrances_per_week by bin(1w) as week_starting_on
week_starting_on | occurrances_per_week |
---|---|
2023-10-29 00:00:00.000 | 1 |
2023-10-22 00:00:00.000 | 2 |
2023-10-15 00:00:00.000 | 2 |
2023-10-08 00:00:00.000 | 1 |
2023-10-01 00:00:00.000 | 2 |
2023-09-24 00:00:00.000 | 1 |
2023-09-17 00:00:00.000 | 2 |
2023-09-10 00:00:00.000 | 4 |
2023-09-03 00:00:00.000 | 2 |
@hannes-ucsc: "I want to wait another two weeks and then redo the diagram above."
Latest survey for the matches of the log message, Manifest object * doesn't have the * tag
, in the logs hasn't change. The last matching log was observed on 2023-11-04.
CloudWatch Logs Insights
region: us-east-1
log-group-names: /aws/lambda/azul-service-prod-manifest
start-time: -7257600s
end-time: 0s
query-string:
fields @timestamp, @message, @log
| filter @message like "Manifest object"
| filter @message like "does not have the"
| sort @timestamp asc
| stats count(*) as occurrances_per_week by bin(1w) as week_starting_on
week_starting_on | occurrances_per_week |
---|---|
2023-10-29 00:00:00.000 | 1 |
2023-10-22 00:00:00.000 | 2 |
2023-10-15 00:00:00.000 | 2 |
2023-10-08 00:00:00.000 | 1 |
2023-10-01 00:00:00.000 | 2 |
2023-09-24 00:00:00.000 | 1 |
@hannes-ucsc: "Looks good. It does show that the warnings have stopped occurring. Assignee to convert the log warning into an assertion."
This https://github.com/DataBiosphere/azul/blob/c4ff9e4b41b430003a19084a899c1bce247fd86f/src/azul/service/manifest_service.py#L439 should not be a warning, it should be an error. We don't think it can actually occur anymore since S3 became strongly consistent.