bradyt / taskw-dart

Taskwarrior-inspired mobile todo app

`FormatException: Missing extension byte` when trying to sync. #27

Closed jdbosser closed 2 years ago

jdbosser commented 2 years ago

Hi! I am new to creating issues on GitHub. Please be kind :) I am willing to help as much as I can in solving this problem, and I will try to give you as much information as I think is relevant.

Some information:

I recently tried to sync with my taskserver over at freecinc. It seems like the syncing should be working okay, given the information that is displayed on the stats screen.

[Screenshot: Screenshot_20220623-190945_task]

[Screenshot: Screenshot_20220623-191011_task]

As I try to sync, I am greeted with this error:

```
FormatException: Missing extension byte (at offset 109920)

#0      _Utf8Decoder.convertSingle (dart:convert-patch/convert_patch.dart:1755)
#1      Utf8Decoder.convert (dart:convert/utf.dart:351)
#2      Utf8Codec.decode (dart:convert/utf.dart:63)
#3      Codec.decode (package:taskc/src/taskc/impl/codec.dart:17)
#4      TaskdClient.request (package:taskc/src/home/impl/taskd_client.dart:104)
<asynchronous suspension>
#5      Home.synchronize (package:taskc/src/home/home.dart:41)
<asynchronous suspension>
#6      _StorageWidgetState.synchronize (package:task/src/widgets/storage_widget.dart:246)
<asynchronous suspension>
```

Note that syncing works fine with Foreground, Taskwarrior (Linux), and taskwc2, so I think there is nothing wrong on the server side.

The error message is rather short. Any way I can provide you with more information?

jdbosser commented 2 years ago

Is it related to UTF-8 encoding? I am Swedish, by the way, which may be relevant. I think I have some tasks that contain åäö.
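
For context, characters such as å, ä and ö each take two bytes in UTF-8, so a decoder can fail if those two bytes end up in different chunks that are decoded separately. The sketch below illustrates that general failure mode in Python (chosen only for illustration; the app itself is written in Dart). It is an assumption that this is what happens here, and the function names are hypothetical.

```python
import codecs


def decode_naively(chunks):
    # Decodes every chunk on its own; fails if a character straddles chunks.
    return "".join(chunk.decode("utf-8") for chunk in chunks)


def decode_incrementally(chunks):
    # Keeps incomplete byte sequences buffered until the next chunk arrives.
    decoder = codecs.getincrementaldecoder("utf-8")()
    parts = [decoder.decode(chunk) for chunk in chunks]
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


text = "köp mjölk åäö"
data = text.encode("utf-8")

# "k" is one byte and "ö" is two, so cutting after byte 2 splits the "ö".
chunks = [data[:2], data[2:]]

try:
    decode_naively(chunks)
    naive_failed = False
except UnicodeDecodeError:
    naive_failed = True

recovered = decode_incrementally(chunks)
```

Decoding only after all bytes have been collected, or using an incremental decoder as above, avoids the problem regardless of where the chunk boundaries fall.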

bradyt commented 2 years ago

Hello @jdbosser, thank you for trying the app and reporting the issue. I tried `task add foo åäö` with a FreeCinc account, and was able to sync to taskw-dart successfully. I hope we can find the information needed to create a minimal reproducible example. I have some notes on cutting data in half at https://github.com/bradyt/taskw-dart/wiki/Draft-of-CONTRIBUTING.md#bisecting.

I realize you might not be set up to troubleshoot data with a self-hosted or local Taskserver. If you have a way to anonymize the data so that nothing sensitive remains, maybe we can chat about this on Matrix; let me know if I should ping you there. Libera IRC's #taskwarrior is another option.

bradyt commented 2 years ago

If you don't want to self-host a remote or local Taskserver, but want to try to bisect your data, you might try https://inthe.am, as I think they have a way to reset the data on a given account. This would let you iterate: cut your local data in half, reset your inthe.am account, sync the data up to inthe.am and back down to taskw-dart, and see whether the error persists.

Bisecting your data like this, you could potentially reproduce the issue with a single task in very few steps, which would give me a report I can more easily address.
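
The bisecting idea can be sketched as follows, in Python for illustration. Both `bisect_failing` and the `fails` callback are hypothetical names; in practice `fails` would be the manual step of resetting the account, syncing a subset up, syncing it back down to the app, and noting whether the error appears. The sketch assumes the failure is caused by a small part of the data and stops early if neither half fails on its own.

```python
def bisect_failing(items, fails):
    """Shrink `items` while `fails(subset)` keeps returning True."""
    if not fails(items):
        raise ValueError("the full data set does not reproduce the failure")
    while len(items) > 1:
        mid = len(items) // 2
        first, second = items[:mid], items[mid:]
        if fails(first):
            items = first
        elif fails(second):
            items = second
        else:
            # The failure needs items from both halves; stop shrinking here.
            break
    return items


# Stand-in for real tasks: pretend one specific task triggers the error.
tasks = [f"task {i}" for i in range(1000)] + ["task with åäö"]
culprit = bisect_failing(tasks, lambda subset: "task with åäö" in subset)
```

With roughly a thousand tasks, this takes about ten rounds of halving rather than a thousand individual tests.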

bradyt commented 2 years ago

I don't know a whole lot about the algorithm that gets us to "Missing extension byte". But some of the internet search results give me a vague idea that something is splitting inconveniently on a non-ASCII character. So if reproducing the issue requires a large set of data as well as non-ASCII characters, one idea might be to anonymize the data with that in mind: replace all ASCII characters in description and annotation strings with the letter "a", but leave non-ASCII characters untouched, as I'm not sure whether some aspect of them needs to be preserved. This potentially retains the reproducibility of the issue with your data, while removing virtually all personal information.

You might still prefer to make such an exchange in private chat, in case anyone nefarious is interested in the size of your task list, or in timestamps, tags, projects, etc. You might replace all ASCII letters in those too, or replace all datetime strings with 2000-01-01T00:00:00 or similar.
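
A minimal sketch of that anonymization, in Python for illustration. It assumes tasks shaped like the JSON objects printed by `task export`, with a `description` string and an optional `annotations` list whose entries each have a `description`; the function names are hypothetical, and other fields such as tags, projects and timestamps are left untouched here.

```python
def anonymize_text(text):
    # Replace every ASCII character with "a"; keep non-ASCII characters as-is.
    # Each replacement is one byte for one byte, so the UTF-8 length is unchanged.
    return "".join("a" if ch.isascii() else ch for ch in text)


def anonymize_task(task):
    # Return a copy of the task with description and annotation text masked.
    masked = dict(task)
    if "description" in masked:
        masked["description"] = anonymize_text(masked["description"])
    if "annotations" in masked:
        masked["annotations"] = [
            {**note, "description": anonymize_text(note.get("description", ""))}
            for note in masked["annotations"]
        ]
    return masked


example = {
    "description": "köp mjölk",
    "annotations": [{"description": "på ICA"}],
    "tags": ["home"],
}
masked_example = anonymize_task(example)
```

Because the byte length of every string is preserved, any non-ASCII character stays at the same byte offset it had in the original text, which may matter if the failure depends on where a character falls in the response.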

bradyt commented 2 years ago

To avoid confusion in the following comment, what you describe as taskw-dart, I'll write as task add, referring to the mobile or GUI or Flutter app named task add on iOS and Android stores.

Another client app you could consider is the small Python tool taskd-client-py, which inspired some early prototypes of this app. I would be curious whether it too can successfully receive data from your server account. If your data worked with Taskwarrior and taskwc2, I'm not surprised, as they have code in common with Taskserver. But the fact that it works on Foreground but not taskw-dart does make me more curious.

I've also been trying to think of the easiest way for you to iterate on testing your data, in the hopes you can see if bisecting preserves some error.

I wondered whether bisecting might not work, if the issue somehow relies on there being lots of tasks as well as non-ASCII characters. But I already have some integration tests with a large number of tasks, and I tried adding some sizable strings with non-ASCII characters there, and I still could not find steps to reproduce your issue.

When I thought about how you might test Taskserver locally, I considered it would not be trivial to then test task add against a locally running Taskserver. When I am testing the app against a Taskserver on localhost, it is always with task add running on the same machine. You may not be so interested in installing all the Dart and Flutter tools needed to run the app from source code, as that can for example require Xcode on macOS, or Visual Studio on Windows.

However, if you only installed Dart, you could potentially run a small CLI I have in the project, named mesh, which currently has only two prototype subcommands, mesh statistics and mesh next. I've created a git branch named scratch/api-subcommand-prototype, which adds a mesh api subcommand that could potentially reproduce your issue from the project source code. mesh api sends a sync request to the server with an empty payload, to which the server should reply with a payload containing the entire task list. mesh api then attempts to decode the response and print the headers, but presumably the decode will fail again with your data before the headers can be displayed.
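
To make the request/response shape concrete, here is a rough Python sketch of framing and reading one message. It rests on assumptions about the Taskserver wire format that are not stated in this thread: a 4-byte big-endian size that counts itself, followed by UTF-8 text made of `name: value` header lines, a blank line, and the payload. The function names and header values are illustrative only, and this is not the code used by mesh or the app.

```python
import io
import struct


def frame_message(headers, payload=""):
    # Header lines, a blank line, then the payload, all encoded as UTF-8.
    text = "".join(f"{name}: {value}\n" for name, value in headers.items())
    body = (text + "\n" + payload).encode("utf-8")
    # Assumed framing: the size prefix includes its own 4 bytes.
    return struct.pack(">L", len(body) + 4) + body


def read_message(recv):
    # `recv(n)` returns up to n bytes, like socket.recv.
    def read_exactly(count):
        buffer = bytearray()
        while len(buffer) < count:
            chunk = recv(count - len(buffer))
            if not chunk:
                raise ConnectionError("stream ended before the full message")
            buffer += chunk
        return bytes(buffer)

    (size,) = struct.unpack(">L", read_exactly(4))
    # Decode only once every byte has arrived, never chunk by chunk.
    text = read_exactly(size - 4).decode("utf-8")
    head, _, payload = text.partition("\n\n")
    headers = dict(line.split(": ", 1) for line in head.split("\n") if line)
    return headers, payload


# Simulate a reply that arrives three bytes at a time.
stream = io.BytesIO(frame_message({"type": "sync", "protocol": "v1"}, "åäö"))
reply_headers, reply_payload = read_message(lambda n: stream.read(min(n, 3)))
```

The point of the sketch is the order of operations in `read_message`: the bytes are accumulated up to the announced size first, and decoded as a whole afterwards.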

To install, you would first need Dart, which might be done via this page: https://dart.dev/get-dart. Then try the following:

```
cd taskw
dart pub get
dart run bin/mesh.dart statistics
dart run bin/mesh.dart api
```

This will default to using your ~/.taskrc config, and should use the same code as the mobile app uses, so I expect you should be able to reproduce the FreeCinc issue with this. The benefit is that it would then not be too hard to get Taskserver set up to run on localhost, reconfigure ~/.taskrc to point to this local Taskserver, and potentially reproduce the issue again with mesh and the local Taskserver. At that point, bisecting the data should be pretty easy.

If you think this sounds like a good idea, but get stuck on some aspect, please do let me know.

bradyt commented 2 years ago

Can you try syncing with a fresh FreeCinc account? That at least would tell us that you have the data to reproduce the issue. If the issue does not reproduce with the same local data going across a new FreeCinc account, you might want to ask the FreeCinc maintainers to send you a copy of your tx.data file, that is, $TASKDDATA/orgs/$org/users/$key/tx.data.

Either way, I hope you'll keep a backup copy of data that reproduces this issue, so that someday we might find a way we can narrow this down.

bradyt commented 2 years ago

No reproduction and no reply, so I'm closing the issue.