Closed harshad16 closed 1 year ago
Steps to reproduce the behavior:
Also directly via the API, e.g.
$ http 'https://khemenu.thoth-station.ninja/api/v1/python/package/version/metadata?name=pandas&version=1.4.2&index=https%3A%2F%2Fpypi.org%2Fsimple&os_name=fedora&os_version=34&python_version=3.9'
HTTP/1.1 500 INTERNAL SERVER ERROR
access-control-allow-origin: *
content-length: 399
content-type: application/json
date: Thu, 07 Jul 2022 10:45:50 GMT
server: gunicorn
set-cookie: 99770cb82864be05282857f803e02327=71283d09b941169601817f89b67f6781; path=/; HttpOnly; Secure; SameSite=None
x-thoth-search-ui-url: https://thoth-station.ninja/search/
x-thoth-version: 0.35.2
x-user-api-service-version: 0.35.2+messaging.0.16.1.storages.0.72.1.common.0.36.2.python.0.16.10
{
    "error": "Solver document not found - solver documents are not in sync with database records, please contact administrator with the provided information: solver-fedora-34-py39-220403215308-caada45da50e5009",
    "parameters": {
        "index": "https://pypi.org/simple",
        "name": "pandas",
        "os_name": "fedora",
        "os_version": "34",
        "python_version": "3.9",
        "version": "1.4.2"
    }
}
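For anyone wanting to script this check, here is a minimal sketch that rebuilds the request URL above and pulls the missing document id out of the error text. The helper names (`build_metadata_url`, `extract_missing_document_id`) and the id regex are assumptions made for illustration, inferred only from the values visible in this issue; they are not part of the Thoth API or codebase.

```python
import re
from urllib.parse import urlencode

# Endpoint copied from the request shown above.
BASE_URL = "https://khemenu.thoth-station.ninja/api/v1/python/package/version/metadata"


def build_metadata_url(params: dict) -> str:
    """Return the metadata endpoint URL with percent-encoded query parameters."""
    return f"{BASE_URL}?{urlencode(params)}"


def extract_missing_document_id(error: str):
    """Return the solver document id found in the error text, or None.

    The pattern is a guess based on the ids quoted in this issue
    (solver-<os>-<os version>-py<version>-<timestamp>-<hex hash>).
    """
    match = re.search(r"solver-[a-z0-9]+-[0-9]+-py[0-9]+-[0-9]+-[0-9a-f]+", error)
    return match.group(0) if match else None


# Same parameters, in the same order, as the failing request.
params = {
    "name": "pandas",
    "version": "1.4.2",
    "index": "https://pypi.org/simple",
    "os_name": "fedora",
    "os_version": "34",
    "python_version": "3.9",
}
url = build_metadata_url(params)

# Error text copied from the 500 response body.
error_text = (
    "Solver document not found - solver documents are not in sync with database "
    "records, please contact administrator with the provided information: "
    "solver-fedora-34-py39-220403215308-caada45da50e5009"
)
missing_id = extract_missing_document_id(error_text)
print(url)
print(missing_id)
```

The extracted id can then be compared against what document-sync reports as already present.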
/triage accepted
/sig devsecops
/priority important-soon
The job is indeed still running, and the previous instance appears to have the same issue (it did not complete properly):
$ oc get jobs -n thoth-frontend-stage | grep document-sync
JOB                      Completed   Duration   Age
document-sync-27596160   0/1         34d        34d
document-sync-27619200   1/1         38m        17d
document-sync-27620640   0/1         17d        17d
document-sync-27640800   0/1         3d5h       3d5h
Moreover, the currently running pod had its container SIGKILLed this morning:
...
Containers:
  document-sync-job:
    Container ID:   cri-o://ff6bafd4184f07ea754324f90dccb3630fdd1102c18b06d1f3e3a287a4499d0e
    Image:          quay.io/thoth-station/document-sync-job:v0.1.0
    Image ID:       quay.io/thoth-station/document-sync-job@sha256:8661512d23891a5abe4596540dbd85a422166b09f2b4ea87ecc9a53231971ad9
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Mon, 25 Jul 2022 05:10:32 +0200
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Fri, 22 Jul 2022 05:09:54 +0200
      Finished:     Mon, 25 Jul 2022 05:10:32 +0200
    Ready:          True
    Restart Count:  1
    Limits:
      cpu:     1
      memory:  2Gi
    Requests:
      cpu:     1
      memory:  2Gi
    Liveness:  tcp-socket :80 delay=259200s timeout=1s period=10s #success=1 #failure=1
...
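The numbers in this output can be cross-checked with a little arithmetic. The sketch below decodes exit code 137 into a signal name and compares the previous container's lifetime with the liveness probe's initial delay. All values are copied from the output above; the function name is hypothetical. Reading the result as "the liveness probe, rather than the memory limit, caused the kill" is only a hypothesis that the output does not confirm, although the reported reason is `Error` rather than `OOMKilled`.

```python
import signal
from datetime import datetime

# Values copied from the pod description above.
EXIT_CODE = 137
LIVENESS_DELAY_S = 259200  # "delay=259200s" on the liveness probe
FMT = "%a, %d %b %Y %H:%M:%S %z"
started = datetime.strptime("Fri, 22 Jul 2022 05:09:54 +0200", FMT)
finished = datetime.strptime("Mon, 25 Jul 2022 05:10:32 +0200", FMT)


def signal_from_exit_code(code: int):
    """Exit codes above 128 conventionally mean 'killed by signal (code - 128)'."""
    return signal.Signals(code - 128).name if code > 128 else None


# How long the previous container ran, and how far past the probe delay that is.
lifetime_s = int((finished - started).total_seconds())
overshoot_s = lifetime_s - LIVENESS_DELAY_S

print(signal_from_exit_code(EXIT_CODE))
print(f"lifetime: {lifetime_s}s, liveness delay: {LIVENESS_DELAY_S}s, "
      f"difference: {overshoot_s}s")
```

The liveness delay of 259200 s is exactly 3 days, and the previous container ran for 3 days and 38 seconds, so the two are close enough to be worth checking alongside memory metrics.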
And the only log lines all look like this:
$ oc logs document-sync-27640800--1-6cq4w -p --tail=10
2022-07-25 03:10:20,254 1 INFO thoth.document_sync:71: Document 'solver-fedora-34-py39-211112181556-e84a86006fca22f5' is already present
2022-07-25 03:10:21,520 1 INFO thoth.document_sync:71: Document 'solver-fedora-34-py39-211112181557-583d00fd0fea564a' is already present
2022-07-25 03:10:22,820 1 INFO thoth.document_sync:71: Document 'solver-fedora-34-py39-211112181557-7b01e5ca8aacdf9' is already present
2022-07-25 03:10:24,178 1 INFO thoth.document_sync:71: Document 'solver-fedora-34-py39-211112181557-8b6a1a2404d79458' is already present
2022-07-25 03:10:25,471 1 INFO thoth.document_sync:71: Document 'solver-fedora-34-py39-211112181558-20547678769a6ba2' is already present
2022-07-25 03:10:26,753 1 INFO thoth.document_sync:71: Document 'solver-fedora-34-py39-211112181558-58d6308a85a04fa5' is already present
2022-07-25 03:10:28,009 1 INFO thoth.document_sync:71: Document 'solver-fedora-34-py39-211112181558-70676d958fdd91fa' is already present
2022-07-25 03:10:29,274 1 INFO thoth.document_sync:71: Document 'solver-fedora-34-py39-211112181558-cc891b9496ed520c' is already present
2022-07-25 03:10:30,547 1 INFO thoth.document_sync:71: Document 'solver-fedora-34-py39-211112181559-33a40b62136dc964' is already present
2022-07-25 03:10:31,777 1 INFO thoth.document_sync:71: Document 'solver-fedora-34-py39-211112181559-858a32d827a67580' is already present
(this goes on for 3 days)
So it seems the job is busy doing nothing but checking documents that are already present, gets killed (I'd say OOMKilled given the signal, but that needs some metrics to confirm), and then goes back to step one.
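To get a feel for how much ground the job covers before it is killed, here is a rough estimate based on the ten log lines quoted above. It assumes those lines are representative of the whole run, which may not hold, so treat the result as an order of magnitude only.

```python
from datetime import datetime

# First and last timestamps of the ten log lines quoted above.
FMT = "%Y-%m-%d %H:%M:%S,%f"
first = datetime.strptime("2022-07-25 03:10:20,254", FMT)
last = datetime.strptime("2022-07-25 03:10:31,777", FMT)
LINES = 10

# Ten lines span nine intervals between consecutive checks.
seconds_per_document = (last - first).total_seconds() / (LINES - 1)

# Number of "already present" checks that fit in a three-day run at that rate.
THREE_DAYS_S = 3 * 24 * 3600
documents_in_three_days = int(THREE_DAYS_S / seconds_per_document)

print(f"{seconds_per_document:.2f}s per document")
print(f"about {documents_in_three_days} documents checked in 3 days")
```

At a bit over one second per document, a three-day run gets through roughly 200,000 presence checks. If the job restarts from the beginning each time and the backlog is larger than that, it would never reach the missing documents.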
After the work and review, we feel it should be 5pt.
Describe the bug
The document-sync job is not syncing all the documents.
Acceptance criteria
Additional notes: