Closed KociOrges closed 3 months ago
Submission event was not accepted
We are hitting an `OptimisticLockingFailureException` in MongoDB.
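For context, here is a minimal sketch of the version check behind this kind of exception. It assumes ingest-core relies on version-based optimistic locking for its Mongo documents, which this ticket does not confirm. The store, the `StaleVersionError` exception, `update_with_retry`, and the document states are all hypothetical names used only for illustration; none of this is ingest-core code.

```python
class StaleVersionError(Exception):
    """Hypothetical stand-in for an optimistic locking failure."""


class VersionedStore:
    """In-memory store that rejects writes based on a stale version."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, data)

    def insert(self, doc_id, data):
        self._docs[doc_id] = (0, dict(data))

    def read(self, doc_id):
        version, data = self._docs[doc_id]
        return version, dict(data)

    def save(self, doc_id, expected_version, data):
        current_version, _ = self._docs[doc_id]
        # The write is accepted only if nobody else saved since we read.
        if current_version != expected_version:
            raise StaleVersionError(
                f"{doc_id}: expected v{expected_version}, found v{current_version}"
            )
        self._docs[doc_id] = (current_version + 1, dict(data))
        return current_version + 1


def update_with_retry(store, doc_id, mutate, max_attempts=3):
    """Re-read and re-apply the change when a concurrent save wins."""
    for attempt in range(1, max_attempts + 1):
        version, data = store.read(doc_id)
        mutate(data)
        try:
            return store.save(doc_id, version, data)
        except StaleVersionError:
            if attempt == max_attempts:
                raise
```

If two requests read the same document version and both try to save, the second save fails the version check. That would be consistent with the "failed save, likely due to multiple requests" warnings in the logs. A caller that re-reads and retries usually gets through on the next attempt.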
Restarted ingest-core and initiated the submission -> 66a36163e8c73060882dcc20/1ce6cd2c-efc7-4963-9ab7-e157ee5039f9
ingest_core_logs.txt
We are hitting:
`2024-07-26 09:44:58 {"log":"2024-07-26 08:44:58.309 WARN 7 --- [nio-8080-exec-2] o.h.i.c.web.GlobalStateExceptionHandler : Handling ResourceNotFoundException and returning NOT_FOUND response\n","stream":"stdout","time":"2024-07-26T08:44:58.309583696Z"}
2024-07-26 09:44:58 {"log":"2024-07-26 08:44:58.308 WARN 7 --- [nio-8080-exec-2] o.h.i.c.web.GlobalStateExceptionHandler : Caught a resource not found exception argument at 'http://172.20.186.196/files/search/findByValidationJobValidationId'; this will generate a NOT_FOUND RESPONSE. Error message: Resource not found!\n","stream":"stdout","time":"2024-07-26T08:44:58.309246611Z"}
2024-07-26 09:44:52 {"log":"2024-07-26 08:44:52.231 WARN 7 --- [io-8080-exec-35] o.h.i.c.web.GlobalStateExceptionHandler : Handling ResourceNotFoundException and returning NOT_FOUND response\n","stream":"stdout","time":"2024-07-26T08:44:52.231909724Z"}
2024-07-26 09:44:52 {"log":"2024-07-26 08:44:52.231 WARN 7 --- [io-8080-exec-35] o.h.i.c.web.GlobalStateExceptionHandler : Caught a resource not found exception argument at 'http://172.20.186.196/files/search/findByValidationJobValidationId'; this will generate a NOT_FOUND RESPONSE. Error message: Resource not found!\n","stream":"stdout","time":"2024-07-26T08:44:52.231780723Z"}
2024-07-26 09:44:52 {"log":"2024-07-26 08:44:52.198 WARN 7 --- [nio-8080-exec-4] o.h.i.c.web.GlobalStateExceptionHandler : Handling ResourceNotFoundException and returning NOT_FOUND response\n","stream":"stdout","time":"2024-07-26T08:44:52.19895465Z"}
2024-07-26 09:44:52 {"log":"2024-07-26 08:44:52.198 WARN 7 --- [nio-8080-exec-4] o.h.i.c.web.GlobalStateExceptionHandler : Caught a resource not found exception argument at 'http://172.20.186.196/files/search/findByValidationJobValidationId'; this will generate a NOT_FOUND RESPONSE. Error message: Resource not found!\n","stream":"stdout","time":"2024-07-26T08:44:52.198481634Z"}
2024-07-26 09:42:27 {"log":"2024-07-26 08:42:27.210 WARN 7 --- [io-8080-exec-15] o.h.i.c.web.GlobalStateExceptionHandler : Attempt a failed save, likely due to multiple requests, at 'http://172.20.186.196/biomaterials/66a36168e8c73060882dcc24'; this will generate a CONFLICT RESPONSE\n","stream":"stdout","time":"2024-07-26T08:42:27.210816008Z"}
2024-07-26 09:42:27 {"log":"2024-07-26 08:42:27.114 WARN 7 --- [io-8080-exec-21] o.h.i.c.web.GlobalStateExceptionHandler : Attempt a failed save, likely due to multiple requests, at 'http://172.20.186.196/biomaterials/66a36168e8c73060882dcc24'; this will generate a CONFLICT RESPONSE\n","stream":"stdout","time":"2024-07-26T08:42:27.114793581Z"}
2024-07-26 09:42:26 {"log":"2024-07-26 08:42:26.715 INFO 7 --- [ntContainer#0-1] o.h.ingest.file.FileService : File validation state is DRAFT for file with cloudUrl s3://org-hca-data-archive-upload-staging/1ce6cd2c-efc7-4963-9ab7-e157ee5039f9/SRR3562314_1.fastq.gz and submission UUID 1ce6cd2c-efc7-4963-9ab7-e157ee5039f9 \n","stream":"stdout","time":"2024-07-26T08:42:26.715992344Z"}
2024-07-26 09:42:26 {"log":"2024-07-26 08:42:26.692 WARN 7 --- [nio-8080-exec-4] o.h.i.c.web.GlobalStateExceptionHandler : Attempt a failed save, likely due to multiple requests, at 'http://172.20.186.196/biomaterials/66a36168e8c73060882dcc24/validEvent'; this will generate a CONFLICT RESPONSE\n","stream":"stdout","time":"2024-07-26T08:42:26.692820328Z"}
2024-07-26 09:42:26 {"log":"2024-07-26 08:42:26.670 INFO 7 --- [ntContainer#0-1] o.h.ingest.file.FileService : Updating file with cloudUrl s3://org-hca-data-archive-upload-staging/1ce6cd2c-efc7-4963-9ab7-e157ee5039f9/SRR3562314_1.fastq.gz and submission UUID 1ce6cd2c-efc7-4963-9ab7-e157ee5039f9\n","stream":"stdout","time":"2024-07-26T08:42:26.671029382Z"}
2024-07-26 09:42:26 {"log":"2024-07-26 08:42:26.624 WARN 7 --- [io-8080-exec-44] o.h.i.c.web.GlobalStateExceptionHandler : Attempt a failed save, likely due to multiple requests, at 'http://172.20.186.196/biomaterials/66a36168e8c73060882dcc23/validEvent'; this will generate a CONFLICT RESPONSE\n","stream":"stdout","time":"2024-07-26T08:42:26.624878504Z"}`
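To triage a longer capture of these logs, a small script can tally the warnings by response type. This is a rough sketch: it assumes every line has the same shape as the ones pasted above (a local timestamp followed by a Docker JSON record whose `log` field holds the Spring-style message). The function names `parse_line` and `summarise` are made up for this example.

```python
import json
import re
from collections import Counter

# Pattern for the message inside the "log" field, as seen in this ticket:
# "<date> <time> <LEVEL> <pid> --- [<thread>] <logger> : <message>"
LOG_RE = re.compile(
    r"^(?P<ts>\S+ \S+)\s+(?P<level>[A-Z]+)\s+\d+\s+---\s+"
    r"\[(?P<thread>[^\]]*)\]\s+(?P<logger>\S+)\s+:\s+(?P<msg>.*)$",
    re.DOTALL,
)


def parse_line(raw):
    """Return the parsed fields of one pasted log line, or None if it does not match."""
    start = raw.find("{")
    if start < 0:
        return None
    payload = raw[start:].strip().rstrip("`").strip()
    try:
        record = json.loads(payload)
    except json.JSONDecodeError:
        return None
    match = LOG_RE.match(record.get("log", "").strip())
    return match.groupdict() if match else None


def classify(msg):
    """Bucket a message by the response type it mentions."""
    if "CONFLICT" in msg:
        return "CONFLICT"
    if "NOT_FOUND" in msg:
        return "NOT_FOUND"
    return "OTHER"


def summarise(lines):
    """Count parsed lines by (level, response type)."""
    counts = Counter()
    for raw in lines:
        parsed = parse_line(raw)
        if parsed is not None:
            counts[(parsed["level"], classify(parsed["msg"]))] += 1
    return counts
```

Calling `summarise(open("ingest_core_logs.txt"))` would give a quick count of CONFLICT and NOT_FOUND warnings against everything else, assuming the attached file uses the same line format.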
We are still investigating why the submission state change from SUBMITTED to EXPORT is not being accepted. The issue occurs only on staging. /cc @tburdett /cc @amnonkhen
@amnonkhen, thanks for helping me get into the pods and check the logs.
Ruled out both of the above errors, CONFLICT and NOT_FOUND; they are non-issues.
Added more logging to the state tracker in a new PR, which was reviewed by Amnon and merged to dev.
logs_comparison_dev_vs_staging.txt
Lots of differences in the guard when the same submission goes through dev and staging.
Revised deployment list:
@amnonkhen, @tburdett, @gabsie - ingest-core and ingest-ui are the projects impacted by this change, and they are now deployed to production.
Testing was successful end to end: submission -> validation -> graph validation -> submission event -> export -> complete
Test submissions:
Test jobs:
This ticket tracks the progress of updates for deploying Managed Access changes across several components.
Related ticket #967
See also ticket #1028 for manual submissions
Components to be updated:
Plan for each component:
1. integration-tests
2. ingest-core
3. ingest-graph-validator (potentially - see Additional Notes)
4. ingest-exporter
5. ingest-ui
6. state-tracking
7. staging-manager
Additional Notes:
- broker: It is just a test change, so it is not necessary now and can be pushed in the future.
- graph validator: The change is to use the authenticated ingest API calls. Not sure if it is actually necessary. Suggestion: roll back graph validator on dev and rerun a manual test on dev.
- validator: There is a managed access branch, but it has not been merged to dev because it is not needed for the feature.