knadh / listmonk

High performance, self-hosted, newsletter and mailing list manager with a modern dashboard. Single binary app.
https://listmonk.app
GNU Affero General Public License v3.0

Unique media name in S3 #1957

Closed julienvienne closed 1 month ago

julienvienne commented 1 month ago

Version:

Description of the bug Hello, when a new media file is uploaded, a new id is created in the database and the file is uploaded to S3. However, if the media filename already exists on S3, the existing file is simply overwritten and we end up with two records in the database pointing to the same file.

For that reason, on the client side, I created a check method that removes all media entries in the database that match the given target filename before uploading a new one. I then tested this method.
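The workaround described above can be sketched as follows. The `list_media` method and the `upload_unique` helper are hypothetical names for illustration (the real listmonk-api client's listing/deletion calls may differ), and `FakeClient` is an in-memory stand-in so the sketch is self-contained:

```python
# Hypothetical sketch of the client-side dedup check: delete every media
# entry whose filename matches the target before uploading it again, so
# the database never holds two rows pointing at the same S3 object.

class FakeClient:
    """In-memory stand-in for the listmonk-api client (assumption)."""

    def __init__(self):
        self._media = {}   # id -> filename
        self._next_id = 1

    def upload_media(self, filename):
        media_id = self._next_id
        self._next_id += 1
        self._media[media_id] = filename
        return {"data": {"id": media_id, "filename": filename}}

    def list_media(self):
        return [{"id": i, "filename": f} for i, f in self._media.items()]

    def delete_media(self, media_id):
        return {"data": self._media.pop(media_id, None) is not None}


def upload_unique(client, filename):
    # Remove every existing entry with the same filename first, then upload.
    for entry in client.list_media():
        if entry["filename"] == filename:
            client.delete_media(entry["id"])
    return client.upload_media(filename)


client = FakeClient()
upload_unique(client, "testfile.txt")
upload_unique(client, "testfile.txt")   # deletes the first entry, re-uploads
assert len(client.list_media()) == 1    # only one record remains
```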

Now here is the bug: after exactly 99 media entries have been removed successfully, the 100th deletion fails with the following HTTP 500:

{'message': 'Error creating Media: sql: no rows in result set'}

If I retry with a new deletion loop, it works fine for up to 100 entries again.

How you can reproduce

Using the listmonk-api Python client:


import listmonk_api

# Listmonk server
username = "user"
password = "password"
listmonk_api_url = "http://localhost:9000"
client = listmonk_api.Api(url=listmonk_api_url, username=username, password=password)

# Creating 200 media entries for the same file
attachment = "/path/to/testfile.txt"
media_ids = []
for i in range(200):
    media_infos = client.upload_media(file=attachment)
    media_ids.append(media_infos["data"]["id"])

# Now deleting the created entries
for idx, media_id in enumerate(media_ids):
    response = client.delete_media(media_id)
    if not response.get("data", False):                       # <- Fails exactly after 100 deletions
        print(f"Failed to delete media {idx}: {media_id}")
        print(response)
        exit(1)
exit(0)
STAR-173 commented 1 month ago

Hi @julienvienne, I was looking into the issue and have a question.

For that reason, on client side, I created a check method that removes all media entries in database that have the given target filename before uploading a new one.

In the part quoted above, you mention that it deletes any previous entries if the particular filename already exists in the DB. Doesn't that mean that at any given point in time, there will only be one entry per filename?

knadh commented 1 month ago

The filesystem provider has an assertUniqueFilename() that checks whether the filename being uploaded already exists on disk, and if it does, adds a small random hash to the filename to make it unique. The S3 provider is missing this. It should ideally use the simples3.FileDetails() call to verify whether the file already exists and apply the same random hash logic as the filesystem provider before uploading.
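A minimal sketch of that logic in Python (listmonk itself is Go; here the caller-supplied `exists` callable stands in for an existence check such as simples3.FileDetails(), and the suffix format is illustrative, not listmonk's actual implementation):

```python
import os
import secrets

def assert_unique_filename(filename, exists):
    """Return filename unchanged if it is free; otherwise append a small
    random hash before the extension until the name no longer clashes.

    `exists` is a callable (e.g. an S3 HEAD request) that returns True
    when the given key is already taken.
    """
    candidate = filename
    while exists(candidate):
        root, ext = os.path.splitext(filename)
        suffix = secrets.token_hex(3)  # small random hash, e.g. 'a1b2c3'
        candidate = f"{root}_{suffix}{ext}"
    return candidate

# Example with an in-memory stand-in for the S3 bucket
taken = {"testfile.txt"}
name = assert_unique_filename("testfile.txt", lambda key: key in taken)
# name is now "testfile_<6 hex chars>.txt", never "testfile.txt" itself
```

With this in place, a second upload of the same file gets a fresh S3 key, so each database record points to its own object.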

julienvienne commented 1 month ago

That sounds great! Thank you for fixing this issue so quickly. It will probably make my workaround unnecessary :)