Closed fwinkelbauer closed 1 week ago
I can reproduce a different unique constraint error (UQE_package_s
) by running:
make test-pgsql\#TestPackageMavenConcurrent
with this test file:
package integration
import (
"fmt"
"net/http"
"strconv"
"strings"
"testing"
"code.gitea.io/gitea/models/unittest"
user_model "code.gitea.io/gitea/models/user"
"code.gitea.io/gitea/tests"
)
func TestPackageMavenConcurrent(t *testing.T) {
defer tests.PrepareTestEnv(t)()
user := unittest.AssertExistsAndLoadBean(t, &user_model.User{ID: 2})
groupID := "com.gitea"
artifactID := "test-project"
packageVersion := "1.0.1"
root := fmt.Sprintf("/api/packages/%s/maven/%s/%s", user.Name, strings.ReplaceAll(groupID, ".", "/"), artifactID)
putFile := func(t *testing.T, path, content string, expectedStatus int) {
req := NewRequestWithBody(t, "PUT", root+path, strings.NewReader(content)).
AddBasicAuth(user.Name)
MakeRequest(t, req, expectedStatus)
}
t.Run("Concurrent Upload", func(t *testing.T) {
defer tests.PrintCurrentTest(t)()
for i := 0; i < 10; i++ {
go putFile(t, fmt.Sprintf("/%s/%s.jar", packageVersion, strconv.Itoa(i)), "test", http.StatusCreated)
}
})
}
The test is based on tests/integration/api_packages_maven_test.go
but only uploads a few files at the same time. My local postgres database was created based on the database preparation documentation.
=== TestPackageMavenConcurrent/Concurrent_Upload (tests/integration/api_packages_fw_test.go:33) --- FAIL: TestPackageMavenConcurrent (7.14s) testlogger.go:60: 2024/04/02 13:25:11 ...xer/stats/indexer.go:87:populateRepoIndexer() [I] Done (re)populating the repo stats indexer with existing repositories testlogger.go:60: 2024/04/02 13:25:12 ...exer/code/indexer.go:312:populateRepoIndexer() [I] Done (re)populating the repo indexer with existing repositories testlogger.go:60: 2024/04/02 13:25:12 ...les/storage/local.go:33:NewLocalStorage() [I] Creating new Local Storage at /Users/fwinkelbauer/Projects/gitea/tests/gitea-lfs-meta --- FAIL: TestPackageMavenConcurrent/Concurrent_Upload (0.04s) testlogger.go:60: 2024/04/02 13:25:12 ...packages/packages.go:138:createPackageAndVersion() [E] Error inserting package: pq: duplicate key value violates unique constraint "UQE_package_s" testlogger.go:60: 2024/04/02 13:25:12 ...kages/maven/maven.go:50:apiError() [E] pq: duplicate key value violates unique constraint "UQE_package_s" testlogger.go:60: 2024/04/02 13:25:12 ...kages/maven/maven.go:53:func17() [E] testlogger.go:60: 2024/04/02 13:25:12 ...eb/routing/logger.go:102:func1() [I] router: completed PUT /api/packages/user2/maven/com/gitea/test-project/1.0.1/2.jar for test-mock:12345, 500 Internal Server Error in 7.0ms @ maven/maven.go:229(maven.UploadPackageFile) api_packages_fw_test.go:29: Error Trace: /Users/fwinkelbauer/Projects/gitea/tests/integration/integration_test.go:402 /Users/fwinkelbauer/Projects/gitea/tests/integration/api_packages_fw_test.go:29 /usr/local/Cellar/go/1.22.1/libexec/src/runtime/asm_amd64.s:1695 Error: Not equal: expected: 201 actual : 500 Test: TestPackageMavenConcurrent/Concurrent_Upload Messages: Request: PUT /api/packages/user2/maven/com/gitea/test-project/1.0.1/2.jar testlogger.go:60: 2024/04/02 13:25:12 ...eb/routing/logger.go:102:func1() [I] router: completed PUT /api/packages/user2/maven/com/gitea/test-project/1.0.1/5.jar for test-mock:12345, 201 Created in 13.3ms @ maven/maven.go:229(maven.UploadPackageFile) testlogger.go:60: 2024/04/02 13:25:12 ...eb/routing/logger.go:102:func1() [I] router: completed PUT /api/packages/user2/maven/com/gitea/test-project/1.0.1/1.jar for test-mock:12345, 201 Created in 13.5ms @ maven/maven.go:229(maven.UploadPackageFile) testlogger.go:60: 2024/04/02 13:25:12 ...eb/routing/logger.go:102:func1() [I] router: completed PUT /api/packages/user2/maven/com/gitea/test-project/1.0.1/8.jar for test-mock:12345, 201 Created in 16.5ms @ maven/maven.go:229(maven.UploadPackageFile) testlogger.go:60: 2024/04/02 13:25:12 ...eb/routing/logger.go:102:func1() [I] router: completed PUT /api/packages/user2/maven/com/gitea/test-project/1.0.1/4.jar for test-mock:12345, 201 Created in 16.6ms @ maven/maven.go:229(maven.UploadPackageFile) testlogger.go:60: 2024/04/02 13:25:12 ...eb/routing/logger.go:102:func1() [I] router: completed PUT /api/packages/user2/maven/com/gitea/test-project/1.0.1/7.jar for test-mock:12345, 201 Created in 18.1ms @ maven/maven.go:229(maven.UploadPackageFile) testlogger.go:60: 2024/04/02 13:25:12 ...eb/routing/logger.go:102:func1() [I] router: completed PUT /api/packages/user2/maven/com/gitea/test-project/1.0.1/0.jar for test-mock:12345, 201 Created in 18.3ms @ maven/maven.go:229(maven.UploadPackageFile) testlogger.go:60: 2024/04/02 13:25:12 ...eb/routing/logger.go:102:func1() [I] router: completed PUT /api/packages/user2/maven/com/gitea/test-project/1.0.1/3.jar for test-mock:12345, 201 Created in 18.3ms @ maven/maven.go:229(maven.UploadPackageFile) testlogger.go:60: 2024/04/02 13:25:12 ...eb/routing/logger.go:102:func1() [I] router: completed PUT /api/packages/user2/maven/com/gitea/test-project/1.0.1/6.jar for test-mock:12345, 201 Created in 18.3ms @ maven/maven.go:229(maven.UploadPackageFile) testlogger.go:60: 2024/04/02 13:25:12 ...eb/routing/logger.go:102:func1() [I] router: completed PUT /api/packages/user2/maven/com/gitea/test-project/1.0.1/9.jar for test-mock:12345, 201 Created in 18.9ms @ maven/maven.go:229(maven.UploadPackageFile) FAIL make: *** [test-pgsql#TestPackageMavenConcurrent] Error 1
Hope this helps!
Created a PR #30335 and put your tests there.
We observed the same or similar bug when migrating several TB of Maven artifacts to a Gitea Maven repository. We now have many packages that cannot be deleted, because the corresponding (S3?) BLOB cannot be found. So our question is: Even if this bug is hopefully fixed very soon, how can damaged packages be removed best from an existing repository?
I've created the new issue https://github.com/go-gitea/gitea/issues/30446 for what @FFock has mentioned above.
Hello! I created a Hotfix example which only allows to upload one package per type like maven at a time. This fix needs to be adapted, when using an other version than 1.21.10. I hope this helps someone:
https://github.com/tlusser/gitea/commit/02999c36180993de822250280413c4a96fe1c583
BR, Thorsten
PS: I also had to fix the test since it could have uploaded the same package twice, but it expected it to upload only once. (go routine variable capture issue)
@lunny : Do you think you code works with the updated test?
My code doesn't work for now because I think the table package_property needs some changes.
@tlusser your proposed hotfix helped, but I found out another problem that was not yet solved when pushing a high amount of submodules with a single mvn deploy.
Maven does uploads in parallel. When a POM and a JAR are uploaded for the same new version at the same point in time, it can happen that the package_version contains the string 'null' in metadata_json. This makes the package_version unusable and returns error 500 for queries of the API or UI.
gitea=# select * from package_version where id=343;
id | package_id | creator_id | version | lower_version | created_unix | is_internal | metadata_json | download_count
-----+------------+------------+---------------------+---------------------+--------------+-------------+---------------+----------------
343 | 121 | 84 | 23.07.1234-SNAPSHOT | 23.07.1234-snapshot | 1713453387 | f | null | 0
(1 row)
gitea=# select * from package_file where version_id=343;
id | version_id | blob_id | name | lower_name | composite_key | is_lead | created_unix
------+------------+---------+-------------------------------------------------------------+-------------------------------------------------------------+---------------+---------+--------------
2518 | 343 | 1770 | pwc-commons-kernel-23.07.1234-20240418.151520-1.jar | pwc-commons-kernel-23.07.1234-20240418.151520-1.jar | | f | 1713453387
2521 | 343 | 1773 | pwc-commons-kernel-23.07.1234-20240418.151520-1.pom | pwc-commons-kernel-23.07.1234-20240418.151520-1.pom | | t | 1713453387
2522 | 343 | 1774 | pwc-commons-kernel-23.07.1234-20240418.151520-1-sources.jar | pwc-commons-kernel-23.07.1234-20240418.151520-1-sources.jar | | f | 1713453387
(3 rows)
As you can see, created_unix is the same.
I could identify the problem and extend your hotfix to cover it.
There's a race condition in UploadPackageFile. The assumption of the original author is that mavendata_json is set during a POM file upload for an existing version or if the POM file is the first upload to the new version at createPackageAndVersion. But if the POM and JAR are uploaded simultaneously, both can enter CreatePackageOrAddFileToExisting . At the end, the metadata from the pom is lost, because the JAR file is a nano second ahead. I could fix it by pulling up your mutex and place before https://github.com/go-gitea/gitea/blob/main/routers/api/packages/maven/maven.go#L266, so basically before any calls to GetVersionByNameAndVersion.
This fixed the issue for us.
I think that just a global mutex on maven package upload function solves this issue. It would only allow one upload at a time (slower uploads), but ensure that packages are uploaded without race conditions. @proxity , @lunny : What do you think about?
I think that just a global mutex on maven package upload function solves this issue. It would only allow one upload at a time (slower uploads), but ensure that packages are uploaded without race conditions. @proxity , @lunny : What do you think about?
As a quick temporary fix, I think it's OK.
For everybody who needs this quick fix please see https://github.com/tlusser/gitea/commit/fb925d159c5c25396311643341e610df0dc100fc
So we don't have a fix included in 1.22.1? My fix could be improved to base the mutex on the package version that is going to be created in the db. That would not slow down other maven uploads which are creating different package versions.
I think we can send a PR to have a lock based on StatusTable
lock so that it will only lock the same package. But this should be a short-term resolution.
Thank you for providing a fix in next bugfix release. Will you also include checks in doctor command to clean invalid data out of the database?
Thank you for providing a fix in next bugfix release. Will you also include checks in doctor command to clean invalid data out of the database?
As I know, only the table package_property
may have dirty data because it has no unique index
. For other package
related tables, duplicated data should not be inserted because of unique index
. Do you find any other invalid data?
Hello, AFAIK there was also tables with missing references to other tables or missing blobs - but it is hard to remember, because i fixed it once in out postgres database and build my own version with a lock which was described in above comment. Afterwards everything seemed to be ok.
Are you sure, that there can not be any issues?
Fixed by #31954
Description
We believe that we've found a concurrency bug when doing maven deploys. Depending on network latency or some other factors,
mvn deploy
may succeed or fail (500 internal server error, see log below). Our project consists of three files:It seems to us that these files are uploaded in parallel which causes a unique constraint violation in our Postgres database when working with the
package_versions
table. We couldn't find any maven configuration where we could manipulate maven's upload/deploy behavior, so we currently are in a position where we can't circumvent the problem. We also tried to change Postgres' default isolation level from read_commited to serializable, but doing so causes a different 500 internal server error.journalctl:
After the failed upload, opening the artifact page throws another 500 which is caused by a nil pointer error (see screenshot).
Gitea Version
1.21.10
Can you reproduce the bug on the Gitea demo site?
No
Log Gist
No response
Screenshots
Git Version
No response
Operating System
Debian GNU/Linux 12 (bookworm)
How are you running Gitea?
We are using the official binary (manually downloaded) on a self hosted Debian server using systemd
Database
PostgreSQL