openfoodfacts / smooth-app

🤳🥫 The new Open Food Facts mobile application for Android and iOS, crafted with Flutter and Dart
https://world.openfoodfacts.org/open-food-facts-mobile-app?utm_source=off&utf_medium=web&utm_campaign=github-repo
Apache License 2.0
854 stars 282 forks source link

FileSystemException: FileSystemException: Cannot open file, path = '/data/user/0/org.openfoodfacts.scanner/cache/crop_... #3533

Closed sentry-io[bot] closed 1 year ago

sentry-io[bot] commented 1 year ago

Sentry Issue: SMOOTHIE-1K6

OSError: OS Error: No such file or directory, errno = 2
FileSystemException: FileSystemException: Cannot open file, path = '/data/user/0/org.openfoodfacts.scanner/cache/crop_165.jpeg' (OS Error: No such file or directory, errno = 2)
teolemon commented 1 year ago

@monsieurtanuki iOS was released during the night, and I've halted Android earlier this morning because of that. Per @alexgarel : impossible to upload images

alexgarel commented 1 year ago

A video capture… as you see it feels like it totally works from the user perspective, while it's not (you have to kill / restart the app to see your image were not sent…).

https://user-images.githubusercontent.com/144788/210987123-c3a93a0c-5f47-4448-9a6f-b6d2bdcf1c98.mp4

alexgarel commented 1 year ago

I'm on android 8.0.0 on a LDN-L21 device.

M123-dev commented 1 year ago

The same happens on iOS as well https://sentry.io/organizations/openfoodfacts/issues/3857655478/?project=5376745&query=is%3Aunresolved&referrer=issue-stream

teolemon commented 1 year ago

The fact that it's cross platform issue makes an Android permission evolution unlikely. Could it be a permission issue ? Or a logic issue during the refactoring ?

monsieurtanuki commented 1 year ago

as you see it feels like it totally works from the user perspective

It somehow works as expected: as we're dealing with background async uploads, we need to display to the user something that was not uploaded yet.

Without going into too much details, there are several layers of the same picture:

  1. the one uploaded to the server
  2. the one stored locally in a file
  3. the one stored in memory so that we get instant display

The problem with the bug is that the file seems to have been removed: it cannot be uploaded to the server, and it cannot be displayed as file. And that is transparent for the user as the most relevant picture is the one in memory. That also explains why when you restart the app you don't see your latest change: the memory has been cleared. And there's still no file. So you see the server version.

The question is: why can't the system see the file? My hunch is that the file is created in a "temporary" directory, which would have made sense if we immediately tried (and succeeded) to upload the file, but may be problematic is that directory is automatically periodically cleaned by the system.

I'll have a look at that right now.

alexgarel commented 1 year ago

Still there using 4.1.0+893-ml-play, see https://sentry.io/organizations/openfoodfacts/issues/3856182794/events/e7941075618c423badc26bb9d281c862/

Could it be, that the file is not found because it is not yet (fully) created (race condition ?)

monsieurtanuki commented 1 year ago

@alexgarel I'm not an expert on Sentry, but it looks like it's the exact same issue:

Please have a look at dev mode / pending tasks: I suspect that it's the old task that is still there.

alexgarel commented 1 year ago

Good catch. I sent new photo, they were not sent to the server. Or could it be that the old task is blocking new ones ? (in this case we should maybe, make task die after a certain amount of retrying ?)

alexgarel commented 1 year ago

You are right I have a lot of tasks :-D

Can I kill them ?

alexgarel commented 1 year ago

A possible scenario:

alexgarel commented 1 year ago

It would mean:

alexgarel commented 1 year ago

There is still a version to test. I test asap.

monsieurtanuki commented 1 year ago

Good catch. I sent new photo, they were not sent to the server. Or could it be that the old task is blocking new ones ? (in this case we should maybe, make task die after a certain amount of retrying ?)

The old task is blocking the new ones. You're asking a very good question: when and why should we give up with some tasks? And how do we notify the user? Not an easy question to answer. cf. https://github.com/openfoodfacts/smooth-app/issues/3263 So far I haven't met much interest about how to deal with that. Hopefully you'll bring some user-oriented answers and expectations.

teolemon commented 1 year ago
monsieurtanuki commented 1 year ago

You are right I have a lot of tasks :-D Can I kill them ?

Not yet. I could PR today an ugly "kill that single task!" onPressed method on the pending tasks page. Ugly because it would just remove the task, not the image file, but that could be implemented later. @alexgarel Interested?

AshAman999 commented 1 year ago

we must avoid this kind of repetitive failure and remove a background task if it fails more than three times for example

Just to be on the safe side, we can catch exceptions on task execution and delete the task on the first fail itself, I guess this can be done for now, at least it won't be a long list of recursive fails in sentry.

monsieurtanuki commented 1 year ago
  • I turn off the app, so the crop file was removed

No, we don't remove the file when the app closes. That said, assuming that the temporary directory got cleaned meanwhile, if your app tries the next day to upload the file, indeed there's no file anymore. That would be the OS that removed the file, not the app. With the latest PR we use a different directory and we remove manually the file, at the end of the task.

monsieurtanuki commented 1 year ago

Just to be on the safe side, we can catch exceptions on task execution and delete the task on the first fail itself, I guess this can be done for now, at least it won't be a long list of recursive fails in sentry.

@AshAman999 How do you alert the user? PS: I don't mind Sentry logs.

Specifically here, the problem is that one specific task will fail forever because of a deleted file, because of a previous bug. And the code fix can solve the future uploads, but not the ones coded with the previous bug.

monsieurtanuki commented 1 year ago

We have - last time I checked - an exponential backoff system that was discussed during one of the weekly meetings:

@teolemon That was before my refactoring, where:

alexgarel commented 1 year ago

@monsieurtanuki right now we have a catastrophic case, because it completely blocks my app. I tested, it also block updates on other products.

If it happens to one of our users, he will think the application is utterly broken. So I think it's far better to kill a task and loose one piece of data than blocking every updates, and loosing everything.

The simplest:

The best, considering the update order is important only for a product:

monsieurtanuki commented 1 year ago

@alexgarel The simplest: kill task after 1 attempt. This I can probably PR today. That means that offline upload will never work - btw neither with 3 attempts.

What you named the best solution isn't flawless either, but I have to work on my simplest solution PR first.

alexgarel commented 1 year ago

@monsieurtanuki we should distinguish between "failing because no network" (for offline mode it's ok) and "failing because of an unknown error" (my case).

AshAman999 commented 1 year ago

failing because no network

So don't cancel the task if it's a socket exception, kinda like that ? else cancel if it's unknown

alexgarel commented 1 year ago

So don't cancel the task if it's a socket exception, kinda like that ? else cancel if it's unknown

yes

alexgarel commented 1 year ago

@monsieurtanuki I removed data from my app (cache removal did not remove tasks…) and now it's working.

So we are not in a big hurry, the release can be shipped tonight, the bug is important to fix, but as it may not happen systematically and it's already there since a long time, let's fix it latter.

monsieurtanuki commented 1 year ago

@AshAman999 @alexgarel For the record the "no internet" exception was Failed host lookup: 'world.openfoodfacts.org'. With a slow internet it could be a time-out exception instead.

alexgarel commented 1 year ago

What you named the best solution isn't flawless either, but I have to work on my simplest solution PR first.

I agree that I didn't really have offline mode in my mind at the time.

But I think we can work on: 

monsieurtanuki commented 1 year ago

So we are not in a big hurry

Good!

Let's a talk about your suggestions:

  1. if a task fails - delay it according to a simple hard-coded schedule (10 min - 1h - 5h - 1d)
  2. continue processing tasks for other products
  3. if a task has failed for 1day - remove it

Following this, you don't take into account the order of the tasks. For instance:

Question: are you OK with ignoring the edit order consistency?

Following this, the user entered data that may not be on the server eventually.

Question: are you OK with not informing the user about this?

Generally speaking, when I worked on that subject I thought that we should give more visibility to the user. Like a red asterisk on the product name if there are pending changes. An access to pending changes on a product would remove user frustration ("dammit, I entered the data yesterday, how come I cannot see it?") and would remove the need for hard-coded rules ("after 1d, we remove the task. Why? Don't ask!").

Another partial solution would be to create different queues: one per product changes, one per image uploads, one per product? Instead of one queue today.

AshAman999 commented 1 year ago

While doing background sync, if something fails other than (cause of no internet) we can keep that in a db with the product barcode and changes and tryna show a notification whenever user comes back in the app something like "Some error occured while doing this upload/edit on [barcode]" in a sperate notification panel within app or system wide notification kind of thing or at least a snack bar

Kinda too ambitous i think, but can it be done not now but sometime later?

Similar to what one drive on windows do,if it fails to sync some files,it shows the error in notifications panel

monsieurtanuki commented 1 year ago

@AshAman999 That's a good idea: let's answer the question "How do other apps do?".

At least we could enhance the "pending background tasks" page (and move it in a less exotic place):

alexgarel commented 1 year ago

Following this, you don't take into account the order of the tasks.

I did took it into account, because I said "continue updates for other products" (as consistency problem are only by product), but it seems maybe overkill to handle a list of task by product, while most of the time errors comes from being offline. This is where it seems important to me to first focus on "is this error a normal network error", to be able to decide what to do.

Question: are you OK with not informing the user about this?

I also think it's a good idea to inform user. That said, as a first step, I tend to think it's ok just telling him some updates fails (notification every time we decide to give up with a task ?).

Visible pending task list with errors might be a good idea though, but I tend to think we should have a default automatic mode.

Take all this as discussion with a grain of salt, I just want to help, I'm not a mobile dev, so I let you judge what is the best solution.

monsieurtanuki commented 1 year ago

Take all this as discussion with a grain of salt, I just want to help, I'm not a mobile dev, so I let you judge what is the best solution.

@alexgarel Your opinion is as good as mine: you use the app. The implementation part is something else :)

Thinking again about it...

Each time we change a product, we do it for

And whatever we do for a different product and a different edit page is NOT related. Examples:

Therefore, we could just list the tasks and run them one by one, even if one failed just before. The only problem is: was there a task that failed before WITH THE SAME PRODUCT AND PAGE? If nothing failed before with the same key, I can go on. If something did fail before, I can just dismiss the failed task (that won't be retried) and go on with the new task, that anyway would overwrite the failed one. Special cases:

And the end of the day, we end up with a task queue full of tasks that failed and were not overwritten by something more relevant. But everything that has no reason to stop was processed.

And then we can think about "how to detect tasks that obviously will fail forever and how to tell the user we discarded them", but that's another story.

alexgarel commented 1 year ago

@monsieurtanuki kudos for the proposed fix in #3533 !

monsieurtanuki commented 1 year ago

Thank you @alexgarel! The idea seems good, and I hope the implementation is good too.

I still think we should warn the user that the changes are not completely uploaded, e.g. with a red banner like the flutter debug banner, and give an easy access to what is pending. In another issue, then! Capture d’écran 2023-01-09 à 19 26 35

alexgarel commented 1 year ago

It could also be a simple "dot" (like when you got messages) on the settings tab, and in this case when you open the setting panel you directly get to the task management where a message tells you some updates were not pushed because of error and you could either dismiss them or tell to continue to try ?

monsieurtanuki commented 1 year ago

@alexgarel Good idea! Something like that I guess: Capture d’écran 2023-01-10 à 14 21 44

cf. https://github.com/flutter/flutter/pull/114560 cf. https://api.flutter.dev/flutter/material/Badge-class.html