GothenburgBitFactory / tasklib

A Python library for interacting with taskwarrior databases.
http://tasklib.readthedocs.org/en/latest/
BSD 3-Clause "New" or "Revised" License

Be robust to invalid utf-8 characters in task db #115

Open bergercookie opened 2 years ago

bergercookie commented 2 years ago

Sometimes the taskwarrior pending.data file contains non-printable or invalid characters, for example because an emoji was added to a task description in a form that Python cannot parse properly. In these cases, tasklib should not crash; it should ignore the offending characters and keep parsing the results of the command.
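A minimal sketch of what tolerant decoding could look like, assuming the command output arrives as raw bytes. The helper name decode_task_output and the sample bytes are hypothetical and are not taken from tasklib's code:

```python
def decode_task_output(raw: bytes) -> str:
    """Decode command output, substituting U+FFFD for any undecodable bytes.

    Hypothetical helper for illustration; not part of tasklib's API.
    """
    return raw.decode("utf-8", errors="replace")


# Made-up sample: a valid description followed by a byte (0xFF)
# that can never appear in well-formed UTF-8.
raw = b'{"description":"buy milk \xff"}'
text = decode_task_output(raw)
print(text)
```

With errors="replace" the bad byte stays visible as a replacement character, whereas errors="ignore" would drop it silently.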

smemsh commented 6 months ago

I'm not entirely sure that ignoring the error is the way to go, because it means the records obtained from taskwarrior will be inconsistent with the actual stored data. In fact, tasklib probably shouldn't be receiving undecodable text at all, since taskwarrior doesn't support arbitrary binary data.
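To illustrate the consistency concern, here is a small sketch using made-up bytes (not real taskwarrior output). It compares lossy decoding with strict decoding, which reports where the bad byte sits:

```python
# Made-up sample: "café order" encoded as Latin-1, which is not valid UTF-8.
raw = b"caf\xe9 order"

# Lossy decoding silently drops the undecodable byte.
lossy = raw.decode("utf-8", errors="ignore")
print(repr(lossy))

# The decoded text no longer round-trips to the stored bytes.
print(lossy.encode("utf-8") == raw)

# Strict decoding raises instead, and the exception carries the byte offset.
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    offset = exc.start
    print(f"undecodable byte at offset {offset}: {exc.reason}")
```

Surfacing the offset lets the caller decide what to do with the record, instead of receiving text that differs from what is stored.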

Here is a small reproducer which tries to store the character '🦀':

# add the rustlang crab emoji as task annotation from shell prompt:
task 1 annotate -- $'\U0001f980'

After this, tasklib crashes when reading the task database. I am not sure whether this is actually a bug in taskwarrior, as the character doesn't seem to get encoded correctly: after the above annotation, task 1 edit shows the annotation as two 16-bit characters, 0xd83e and 0xdd80 (a UTF-16 surrogate pair), rather than the single character 0x1f980. However, if the character is inserted manually with task 1 edit and saved, both taskwarrior and tasklib seem to handle it just fine.
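The relationship between 0x1f980 and the pair 0xd83e / 0xdd80 can be checked in Python. The sketch below also shows that a strict UTF-8 decoder rejects the two surrogates when each is encoded separately. That this is how the bytes actually ended up in the database is an assumption; the thread does not confirm it.

```python
import struct

crab = "\U0001f980"

# UTF-16 represents code points above U+FFFF as a surrogate pair.
high, low = struct.unpack(">HH", crab.encode("utf-16-be"))
print(hex(high), hex(low))

# Recombining the pair yields the original code point.
recombined = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
print(hex(recombined))

# Well-formed UTF-8 for the same character is a single 4-byte sequence.
good = crab.encode("utf-8")

# ASSUMPTION: each surrogate was encoded on its own (3 bytes each).
# "surrogatepass" is used here only to construct such bytes.
bad = (chr(high) + chr(low)).encode("utf-8", errors="surrogatepass")

# A strict UTF-8 decoder accepts the first form and rejects the second.
good_ok = good.decode("utf-8") == crab
try:
    bad.decode("utf-8")
    bad_ok = True
except UnicodeDecodeError:
    bad_ok = False
print(good_ok, bad_ok)
```

If the stored bytes do look like the second form, the crash is the decoder correctly refusing ill-formed input, which would point at the writer rather than the reader.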

@bergercookie did you get the character into the database specifically by using task annotate?

smemsh commented 6 months ago

Note that the latest development version of taskwarrior seems to store this fine, see GothenburgBitFactory/taskwarrior#3286

bergercookie commented 3 months ago

> @bergercookie did you get the character into the database by specifically using task annotate ?

This is from about 2 years ago, so I honestly have no idea how I reproduced it :sweat_smile:

smemsh commented 3 months ago

I think the issue should be closed, because the problem doesn't exist anymore.