OpenArchive / Save-app-android-old

This is the Save app for Android
https://open-archive.org
GNU General Public License v3.0
93 stars, 26 forks

Upload Service: Internet Archive uploads multiple copies #599

Closed ryjen closed 1 month ago

ryjen commented 3 months ago

Describe the bug: After uploading a file to the Internet Archive, checking the upload in your https://archive.org account shows that duplicate copies of the file were uploaded.

To Reproduce Steps to reproduce the behavior:

  1. Log in to or set up an Internet Archive backend
  2. Click on the '+' and/or create a folder
  3. Select a single image to upload
  4. Upload the file
  5. Wait 10-30 minutes for the Internet Archive processes to finish archiving
  6. Log in to your Internet Archive account and go to 'my uploads'
  7. See the duplicate uploads

Expected behavior: A single copy of the image, its related metadata, and proof.

Actual behaviour: Multiple copies.

Screenshots Screenshot 2024-03-20 at 5 35 26 PM

Environment (please complete the following information):

Additional context

The problem could be two things:

  1. An error or lifecycle issue in the upload service causes multiple uploads of the same file
  2. The identifier for the upload changes, and IA treats the uploads as separate files

vanichitkara commented 3 months ago

I'm still getting the same issue as described above. @ryjen @foundscapes Screenshot_2024-03-27-23-06-16-12_40deb401b9ffe8e1df2f1cc5ba480b12

OS: Android 14, Device: OnePlus 11R, Save version: 0.3.3

ryjen commented 3 months ago

@vanichitkara Can you confirm those are not dups from testing with the same image? If you click into them do they have the same identifier?

Screenshot 2024-03-27 at 10 42 56 AM

vanichitkara commented 3 months ago

@ryjen They have different identifiers. These are the identifiers for the multiple copies of the same image:

  - IMG20240309104945-jpg-rtboz
  - IMG20240309104945-jpg-lyek
  - IMG20240309104945-jpg-1uy2
  - IMG20240309104945-jpg-npcl

The picture with this file name was uploaded four times, and its associated file also appears four times. Each of the four copies of the picture I intended to upload has a different identifier.
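For anyone checking their own 'my uploads' list, here is a rough Python sketch for spotting this pattern. It assumes (based only on the identifiers above, not on the app's source) that each identifier is the file name followed by a short random suffix after the last hyphen. `group_by_base` is a hypothetical helper, not part of Save or any Internet Archive tool.

```python
from collections import defaultdict


def group_by_base(identifiers):
    """Group identifiers that differ only in the part after the last hyphen.

    Assumption: identifiers look like '<file name>-<random suffix>'.
    """
    groups = defaultdict(list)
    for ident in identifiers:
        base, sep, _suffix = ident.rpartition("-")
        if not sep:
            # No hyphen at all: treat the whole identifier as its own base.
            base = ident
        groups[base].append(ident)
    return dict(groups)


identifiers = [
    "IMG20240309104945-jpg-rtboz",
    "IMG20240309104945-jpg-lyek",
    "IMG20240309104945-jpg-1uy2",
    "IMG20240309104945-jpg-npcl",
]

# Keep only the bases that show up under more than one identifier.
duplicates = {
    base: ids for base, ids in group_by_base(identifiers).items() if len(ids) > 1
}
for base, ids in duplicates.items():
    print(f"{base}: {len(ids)} copies")
```

Any base that appears under more than one identifier is a candidate duplicate, although an intentional re-upload of the same picture would look the same.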

ryjen commented 3 months ago

@vanichitkara Do you know if any errors occurred while the duplicates were being created?

I suspect the uploader is re-running after an error and creating a new identifier now.

It is still better than my previous testing, where there were 10+ copies.
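To illustrate the suspicion above: if the identifier is generated inside the retry path, every retry creates a new item. A minimal Python sketch of the alternative, generating the identifier once and reusing it across retries, could look like the following. All names here (`make_identifier`, `upload_with_retry`, `flaky_send`) are hypothetical and the suffix format is an assumption; this is not the app's actual upload code.

```python
import random
import string


def make_identifier(filename, rng):
    """Build '<file name>-<4 random chars>' (assumed format, for illustration)."""
    slug = filename.replace(".", "-")
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(rng.choice(alphabet) for _ in range(4))
    return f"{slug}-{suffix}"


def upload_with_retry(filename, send, rng, max_attempts=3):
    """Call send(identifier, filename), retrying on OSError with the SAME identifier."""
    identifier = make_identifier(filename, rng)  # generated once, outside the loop
    for attempt in range(1, max_attempts + 1):
        try:
            send(identifier, filename)
            return identifier
        except OSError:
            if attempt == max_attempts:
                raise  # out of attempts, let the caller see the failure


# Simulated backend: fails twice, then succeeds, recording every identifier it sees.
seen = []


def flaky_send(identifier, filename):
    seen.append(identifier)
    if len(seen) < 3:
        raise OSError("simulated network failure")


result = upload_with_retry("IMG20240309104945.jpg", flaky_send, random.Random(0))
print(result, "attempts:", len(seen), "distinct identifiers:", len(set(seen)))
```

With this shape, three attempts still target a single identifier, so a retry cannot produce a second item by itself.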

vanichitkara commented 3 months ago

@ryjen I'll need to test again to see if upload retry is the culprit of these multiple uploads. Give me some time, I'll get back to you tomorrow for this.

vanichitkara commented 3 months ago

@ryjen I tested whether the duplicates appear only when errors occur during upload. In my tests, duplicates appeared regardless of whether the uploads had to be retried. Four of my five images went through without a retry and the fifth had to be retried, but duplicates were created for all of them.

foundscapes commented 3 months ago

@vanichitkara this is working for me (no dupes), can you retest this weekend and close if it's fixed for you? ty!

vanichitkara commented 3 months ago

@foundscapes, I still see duplicate images after uploading. I uploaded these pictures today to test the teal % and spinner.

Phone: OnePlus 11R, OS: Android 14, Save version: 0.4.0

Screenshot_2024-04-07-16-51-40-43_40deb401b9ffe8e1df2f1cc5ba480b12

ryjen commented 3 months ago

Keep in mind that it is still possible to upload the same picture more than once if you intend to.

Based on my testing, the picture evidence is still a vast improvement over the earlier behaviour, where background processes uploaded unintentionally (it got quite spammy).

@rapuckett

You might need to examine why uploads to IA require a unique filename with generated randomness:

a) whether it was added to work around errors, or

b) whether the API needs it as an identifier.

In the likely case of b), you may need to design a better folder structure as an identifier, or some other method instead of randomness, to truly stop intentional and unintentional duplicates.
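As a rough sketch of what "some other method instead of randomness" could look like, the Python below derives the identifier from the folder name, the file name, and a hash of the file's bytes. `deterministic_identifier` is a hypothetical helper, and the set of characters it keeps is an assumption rather than a statement of what the IA API accepts.

```python
import hashlib
import re


def deterministic_identifier(folder, filename, content, digest_chars=8):
    """Derive an identifier from folder, file name, and file content.

    The same inputs always give the same identifier, so a repeated upload
    of the same file maps to the same item instead of a new one.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    raw = f"{folder}-{filename}-{digest}"
    # Assumption: keep letters, digits, '.', '_' and '-'; collapse anything else to '-'.
    return re.sub(r"[^A-Za-z0-9._-]+", "-", raw).strip("-")


first = deterministic_identifier("Trip 2024", "IMG20240309104945.jpg", b"image bytes")
again = deterministic_identifier("Trip 2024", "IMG20240309104945.jpg", b"image bytes")
other = deterministic_identifier("Trip 2024", "IMG20240309104945.jpg", b"different bytes")

print(first == again)  # same file, same folder: same identifier
print(first == other)  # different content: different identifier
```

The trade-off is that this also blocks intentional re-uploads of the same file into the same folder, which may or may not be what is wanted.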

foundscapes commented 1 month ago

I have not had this issue in over a month