GothenburgBitFactory / taskwarrior

Taskwarrior - Command line Task Management
https://taskwarrior.org
MIT License
4.45k stars 305 forks source link

`task export` incorrectly formats numeric values if they are not actually numeric #3330

Open ashprice opened 7 months ago

ashprice commented 7 months ago

The basic process and the error message is thus:

$ task export > all-tasks.json
$ pacman -Syu task
... installs task 3.0.0 ...
$ task import all-tasks.json
Importing 'all-tasks.json'
Error: missing ']' at position 1

Output of task diag:

task 2.6.2
   Platform: Linux

Compiler
    Version: 12.2.1 20230201
       Caps: +stdc +stdc_hosted +LP64 +c8 +i32 +l64 +vp64 +time_t64
 Compliance: C++17

Build Features
     Commit: 7400e6ed6
      CMake: 3.26.0
    libuuid: libuuid + uuid_unparse_lower
  libgnutls: 3.8.0
 Build type: release

Configuration
       File: /home/hearth/.taskrc (found), 4753 bytes, mode 100644
       Data: /home/hearth/.task (found), dir, mode 40777
    Locking: Enabled
         GC: Enabled
     Server: 
         CA: -
Certificate: -
        Key: -
      Trust: strict
    Ciphers: NORMAL
      Creds: 

Hooks
     System: Enabled
   Location: /home/hearth/.task/hooks
             (-none-)

Tests
   Terminal: 302x51
       Dups: Scanned 4850 tasks for duplicate UUIDs:
             No duplicates found
 Broken ref: Scanned 4850 tasks for broken references:
             No broken references found

So... I see in another git issue here that it was surprising Arch was upgrading people with no warning. It seems that, actually, something else is going on?

I note that the PKBUILD in the Arch /extra/ repository looks fine to me: https://gitlab.archlinux.org/archlinux/packaging/packages/task/-/blob/main/PKGBUILD?ref_type=heads

So I am not sure what is up here!

I also don't know if this is actually the cause of my issue - it could be something entirely unrelated to the upgrade. I mean, if I am indeed on 2.6.2, import should work as expected for 2.6.2!

My package manager upgraded me without me being aware of breaking changes. So I was actually on whatever the Arch repositories says is 3.0.0 for a while. I didn't notice any breakages whatsoever. Given it seems that 3.0.0 is actually 2.6.2 here, that is probably why...

Current plan of action: build task 3.0.0 myself, and see if task import works without a hitch.

The question I have in case it doesn't, I guess, would be if there's some obvious regex to someone familiar with the relevant .json structure I can run to catch any breakages in the .json. Probably it would be easy to spot with a small file, but I have just shy of 5,000 lines.

Thank you for your work!

ashprice commented 7 months ago
$ which task
/usr/local/bin/task
$ /usr/bin/task diag
Found existing '.data' files in /home/hearth/.task
  Taskwarrior's storage format changed in 3.0, requiring a manual migration.
  See https://github.com/GothenburgBitFactory/taskwarrior/releases.

task 3.0.0
   Platform: Linux

Compiler
    Version: 13.2.1 20230801
       Caps: +stdc +stdc_hosted +LP64 +c8 +i32 +l64 +vp64 +time_t64
 Compliance: C++17

Build Features
     Commit: 8a0a98d3e
      CMake: 3.29.0
    libuuid: libuuid + uuid_unparse_lower
 Build type: 

Configuration
       File: /home/hearth/.taskrc (found), 4753 bytes, mode 100644
       Data: /home/hearth/.task (found), dir, mode 40777
    Locking: Enabled
         GC: Enabled
Hooks
     System: Enabled
   Location: /home/hearth/.task/hooks
             (-none-)

Tests
   Terminal: 302x52
       Dups: Scanned 0 tasks for duplicate UUIDs:
             No duplicates found
 Broken ref: Scanned 0 tasks for broken references:
             No broken references found

that explains that one somewhat.

/usr/bin/task import all-tasks.json
Found existing '.data' files in /home/hearth/.task
  Taskwarrior's storage format changed in 3.0, requiring a manual migration.
  See https://github.com/GothenburgBitFactory/taskwarrior/releases.
Importing 'all-tasks.json'
Error: missing value at position 2752835
ashprice commented 7 months ago

OK. The issue was a value there with priority:M. I use other priority values, 1-5. Changing the byte at that position so the relevant string reflects priority:2 has allowed import to proceed. I wonder - is it worth making a note in the documentation somewhere before I close this issue?

ashprice commented 7 months ago

Also: I am thinking my issues with /usr/bin/ vs. /usr/local might be because a long time ago I was toying with modifying taskwarrior code. Probably, I lazily ran a make install command or something (I don't actually remember how to build TW offhand) instead of a PKGBUILD, without forseeing the consequences.

djmitche commented 7 months ago

The task export output is produced by a custom-written JSON encoder, and task import uses a custom parser, too. I don't know if those have been tested thoroughly enough to be confident that they generate correct JSON or correctly parse JSON (or if the parser correctly parses incorrect JSON created by the encoder!)

That said, we can try to fix the parser if it's buggy, or needs to be more lenient in what it accepts. Could you provide a redacted, reduced version of the JSON file that caused this error?

That reminds me, my own import failed on some invalid Unicode encoding in the exported JSON. #3333 tracks that.

ashprice commented 7 months ago

There's a decent amount of personal info spread throughout the JSON and I wouldn't be so confident I'd managed to strip it all out.

However, I can share the offending line, given I fixed it and it tells me exactly where it is:

{"id":0,"description":"blah blah blah","due":"20210101T000000Z","end":"20220829T150554Z","entry":"20220829T142746Z","modified":"20220829T150554Z","priority":M,"status":"completed","uuid":"8a041edc-733a-4d6c-b352-efd2943c29da","urgency":17.9},

As mentioned above, the issue seems to be "priority":M, which I do not use. Changing it to "priority":5 fixes the issue. Vim tells me that character is <M> 77, Hex 4d, Octal 115 so it seems to be just a regular M.

ashprice commented 7 months ago

I'm guessing maybe it would expect letters like that to be eg. "priority":"M"?

djmitche commented 7 months ago

Yeah, a bare M is definitely not valid JSON. I'm confused how that would occur, though -- my export has, for example,

{"id":43,"description":"get dog food","entry":"20240205T222439Z","modified":"20240205T222439Z","priority":"M","status":"pending","uuid":"b0091a73-6810-469c-8d42-aff09c6632ef","wait":"20240505T222439Z","tags":["buy","next"],"urgency":16.2178},

When you say you "do not use" that, what does that mean?

ashprice commented 7 months ago

I mean that I don't use the default priority UDAs. That one probably made it through the cracks somehow, maybe when I changed over to my current UDAs I ran the command only on pending tasks or a specific context or something. Instead, I have:

uda.priority.label=priority
uda.priority.type=numeric
uda.priority.values=5,4,3,2,1,
urgency.uda.priority.5.coefficient=6.0
urgency.uda.priority.4.coefficient=4.5
urgency.uda.priority.3.coefficient=3.0
urgency.uda.priority.2.coefficient=1.5
urgency.uda.priority.1.coefficient=0
urgency.uda.priority..coefficient=0

Before upgrading these were floats:

uda.priority.label=priority
uda.priority.type=numeric
uda.priority.values=5.000000,4.000000,3.000000,2.000000,1.000000,
urgency.uda.priority.5.coefficient=6.0
urgency.uda.priority.4.coefficient=4.5
urgency.uda.priority.3.coefficient=3.0
urgency.uda.priority.2.coefficient=1.5
urgency.uda.priority.1.coefficient=0
urgency.uda.priority..coefficient=0

but it would display the floats as not-floats in my reports and assign the urgency correctly. Not sure why it changed, but it doesn't seem very important given stuff works.

djmitche commented 7 months ago

Oh, interesting! Indeed, if I add a task with a default .taskrc:

[
{"id":1,"description":"foo","entry":"20240405T013424Z","modified":"20240405T013424Z","priority":"M","status":"pending","uuid":"dc75a11d-afcf-45d2-9fde-21e4fe52e85e","urgency":3.9}
]

But once I add that UDA configuration, TW assumes that the priority is numeric and produces invalid JSON:

[
{"id":1,"description":"foo","entry":"20240405T013424Z","modified":"20240405T013424Z","priority":M,"status":"pending","uuid":"dc75a11d-afcf-45d2-9fde-21e4fe52e85e","urgency":3.9}
]

So that's definitely a bug in task export.

djmitche commented 5 months ago

I'm honestly not sure what TW should do -- with a definition of priority as a numeric type, but with non-numeric values in the task DB, what should it do? Assume they are zero or missing? Error out when these values are spotted (thus requiring manual DB editing??)? Should task export just produce a string in this case and let the other operations with columns deal with this as they will?

@tbabej, any guidance here?

tbabej commented 4 months ago

Even if we fix the export formatting here, the fact that the field contains unexpected value type remains, and would likely cause problem (or silent bugs) in things that involve that attribute - i.e. in this case, if urgency is being calculated.

I guess we should not completely error out, but any invalid attributes should be regarded as not set, and should emit warnings to the user. Am I correct in assuming that manual DB edits in the current model could be prevented by the user running simply mod operation to override the attribute values in the affected tasks?

djmitche commented 4 months ago

I think task mod could fix this case, yes.

I think the issue here is that task export is producing invalid JSON, e.g., {"priority": M}. Since the types of the JSON values are user-controllable via UDA, I suppose there are two options here:

It sounds like you're suggesting the second option?