Closed sentry-io[bot] closed 1 year ago
@monsieurtanuki iOS was released during the night, and I've halted Android earlier this morning because of that. Per @alexgarel : impossible to upload images
A video capture… as you see it feels like it totally works from the user perspective, while it's not (you have to kill / restart the app to see your image were not sent…).
https://user-images.githubusercontent.com/144788/210987123-c3a93a0c-5f47-4448-9a6f-b6d2bdcf1c98.mp4
I'm on android 8.0.0 on a LDN-L21 device.
The same happens on iOS as well https://sentry.io/organizations/openfoodfacts/issues/3857655478/?project=5376745&query=is%3Aunresolved&referrer=issue-stream
The fact that it's cross platform issue makes an Android permission evolution unlikely. Could it be a permission issue ? Or a logic issue during the refactoring ?
as you see it feels like it totally works from the user perspective
It somehow works as expected: as we're dealing with background async uploads, we need to display to the user something that was not uploaded yet.
Without going into too much details, there are several layers of the same picture:
The problem with the bug is that the file seems to have been removed: it cannot be uploaded to the server, and it cannot be displayed as file. And that is transparent for the user as the most relevant picture is the one in memory. That also explains why when you restart the app you don't see your latest change: the memory has been cleared. And there's still no file. So you see the server version.
The question is: why can't the system see the file? My hunch is that the file is created in a "temporary" directory, which would have made sense if we immediately tried (and succeeded) to upload the file, but may be problematic is that directory is automatically periodically cleaned by the system.
I'll have a look at that right now.
Still there using 4.1.0+893-ml-play, see https://sentry.io/organizations/openfoodfacts/issues/3856182794/events/e7941075618c423badc26bb9d281c862/
Could it be, that the file is not found because it is not yet (fully) created (race condition ?)
@alexgarel I'm not an expert on Sentry, but it looks like it's the exact same issue:
Please have a look at dev mode / pending tasks: I suspect that it's the old task that is still there.
Good catch. I sent new photo, they were not sent to the server. Or could it be that the old task is blocking new ones ? (in this case we should maybe, make task die after a certain amount of retrying ?)
You are right I have a lot of tasks :-D
Can I kill them ?
A possible scenario:
It would mean:
There is still a version to test. I test asap.
Good catch. I sent new photo, they were not sent to the server. Or could it be that the old task is blocking new ones ? (in this case we should maybe, make task die after a certain amount of retrying ?)
The old task is blocking the new ones. You're asking a very good question: when and why should we give up with some tasks? And how do we notify the user? Not an easy question to answer. cf. https://github.com/openfoodfacts/smooth-app/issues/3263 So far I haven't met much interest about how to deal with that. Hopefully you'll bring some user-oriented answers and expectations.
You are right I have a lot of tasks :-D Can I kill them ?
Not yet.
I could PR today an ugly "kill that single task!" onPressed
method on the pending tasks page.
Ugly because it would just remove the task, not the image file, but that could be implemented later.
@alexgarel Interested?
we must avoid this kind of repetitive failure and remove a background task if it fails more than three times for example
Just to be on the safe side, we can catch exceptions on task execution and delete the task on the first fail itself, I guess this can be done for now, at least it won't be a long list of recursive fails in sentry.
- I turn off the app, so the crop file was removed
No, we don't remove the file when the app closes. That said, assuming that the temporary directory got cleaned meanwhile, if your app tries the next day to upload the file, indeed there's no file anymore. That would be the OS that removed the file, not the app. With the latest PR we use a different directory and we remove manually the file, at the end of the task.
Just to be on the safe side, we can catch exceptions on task execution and delete the task on the first fail itself, I guess this can be done for now, at least it won't be a long list of recursive fails in sentry.
@AshAman999 How do you alert the user? PS: I don't mind Sentry logs.
Specifically here, the problem is that one specific task will fail forever because of a deleted file, because of a previous bug. And the code fix can solve the future uploads, but not the ones coded with the previous bug.
We have - last time I checked - an exponential backoff system that was discussed during one of the weekly meetings:
@teolemon That was before my refactoring, where:
@monsieurtanuki right now we have a catastrophic case, because it completely blocks my app. I tested, it also block updates on other products.
If it happens to one of our users, he will think the application is utterly broken. So I think it's far better to kill a task and loose one piece of data than blocking every updates, and loosing everything.
The simplest:
The best, considering the update order is important only for a product:
@alexgarel The simplest: kill task after 1 attempt. This I can probably PR today. That means that offline upload will never work - btw neither with 3 attempts.
What you named the best solution isn't flawless either, but I have to work on my simplest solution PR first.
@monsieurtanuki we should distinguish between "failing because no network" (for offline mode it's ok) and "failing because of an unknown error" (my case).
failing because no network
So don't cancel the task if it's a socket exception, kinda like that ? else cancel if it's unknown
So don't cancel the task if it's a socket exception, kinda like that ? else cancel if it's unknown
yes
@monsieurtanuki I removed data from my app (cache removal did not remove tasks…) and now it's working.
So we are not in a big hurry, the release can be shipped tonight, the bug is important to fix, but as it may not happen systematically and it's already there since a long time, let's fix it latter.
@AshAman999 @alexgarel For the record the "no internet" exception was Failed host lookup: 'world.openfoodfacts.org'
.
With a slow internet it could be a time-out exception instead.
What you named the best solution isn't flawless either, but I have to work on my simplest solution PR first.
I agree that I didn't really have offline mode in my mind at the time.
But I think we can work on:
So we are not in a big hurry
Good!
Let's a talk about your suggestions:
- if a task fails - delay it according to a simple hard-coded schedule (10 min - 1h - 5h - 1d)
- continue processing tasks for other products
- if a task has failed for 1day - remove it
Following this, you don't take into account the order of the tasks. For instance:
'name1'
'name2'
'name2'
'name1'
'name2'
Question: are you OK with ignoring the edit order consistency?
Following this, the user entered data that may not be on the server eventually.
Question: are you OK with not informing the user about this?
Generally speaking, when I worked on that subject I thought that we should give more visibility to the user. Like a red asterisk on the product name if there are pending changes. An access to pending changes on a product would remove user frustration ("dammit, I entered the data yesterday, how come I cannot see it?") and would remove the need for hard-coded rules ("after 1d, we remove the task. Why? Don't ask!").
Another partial solution would be to create different queues: one per product changes, one per image uploads, one per product? Instead of one queue today.
While doing background sync, if something fails other than (cause of no internet) we can keep that in a db with the product barcode and changes and tryna show a notification whenever user comes back in the app something like "Some error occured while doing this upload/edit on [barcode]" in a sperate notification panel within app or system wide notification kind of thing or at least a snack bar
Kinda too ambitous i think, but can it be done not now but sometime later?
Similar to what one drive on windows do,if it fails to sync some files,it shows the error in notifications panel
@AshAman999 That's a good idea: let's answer the question "How do other apps do?".
At least we could enhance the "pending background tasks" page (and move it in a less exotic place):
Following this, you don't take into account the order of the tasks.
I did took it into account, because I said "continue updates for other products" (as consistency problem are only by product), but it seems maybe overkill to handle a list of task by product, while most of the time errors comes from being offline. This is where it seems important to me to first focus on "is this error a normal network error", to be able to decide what to do.
Question: are you OK with not informing the user about this?
I also think it's a good idea to inform user. That said, as a first step, I tend to think it's ok just telling him some updates fails (notification every time we decide to give up with a task ?).
Visible pending task list with errors might be a good idea though, but I tend to think we should have a default automatic mode.
Take all this as discussion with a grain of salt, I just want to help, I'm not a mobile dev, so I let you judge what is the best solution.
Take all this as discussion with a grain of salt, I just want to help, I'm not a mobile dev, so I let you judge what is the best solution.
@alexgarel Your opinion is as good as mine: you use the app. The implementation part is something else :)
Thinking again about it...
Each time we change a product, we do it for
And whatever we do for a different product and a different edit page is NOT related. Examples:
Therefore, we could just list the tasks and run them one by one, even if one failed just before. The only problem is: was there a task that failed before WITH THE SAME PRODUCT AND PAGE? If nothing failed before with the same key, I can go on. If something did fail before, I can just dismiss the failed task (that won't be retried) and go on with the new task, that anyway would overwrite the failed one. Special cases:
And the end of the day, we end up with a task queue full of tasks that failed and were not overwritten by something more relevant. But everything that has no reason to stop was processed.
And then we can think about "how to detect tasks that obviously will fail forever and how to tell the user we discarded them", but that's another story.
@monsieurtanuki kudos for the proposed fix in #3533 !
Thank you @alexgarel! The idea seems good, and I hope the implementation is good too.
I still think we should warn the user that the changes are not completely uploaded, e.g. with a red banner like the flutter debug banner, and give an easy access to what is pending. In another issue, then!
It could also be a simple "dot" (like when you got messages) on the settings tab, and in this case when you open the setting panel you directly get to the task management where a message tells you some updates were not pushed because of error and you could either dismiss them or tell to continue to try ?
@alexgarel Good idea! Something like that I guess:
cf. https://github.com/flutter/flutter/pull/114560 cf. https://api.flutter.dev/flutter/material/Badge-class.html
Sentry Issue: SMOOTHIE-1K6