GothenburgBitFactory / taskwarrior

Taskwarrior - Command line Task Management
https://taskwarrior.org
MIT License

task description gets truncated when using multibyte characters #2215

Open zoyo-de opened 5 years ago

zoyo-de commented 5 years ago

When displaying reports, the end of the task description is truncated by as many characters as there are multibyte characters in the description. This happens when the task has annotations and the description line is shorter than the annotation line.

Here is an example session using German umlauts as multibyte characters (some unimportant lines deleted).

First we create and annotate a task without umlauts:

$ task add +test "abcdefg"
Created task 104.
$ task 104 annotate "bla foo bar"
Annotating task 104 'abcdefg'.
Annotated 1 task.
$ task +test
[task next ( +test )]

ID  Age Tag  Description              Urg 
104 12s test abcdefg                     0
               2019-09-15 bla foo bar     

Everything is displayed fine. Now we add umlauts:

$ task 104 mod "äöü abcdefg"
Modifying task 104 'äöü abcdefg'.
Modified 1 task.
$ task +test
[task next ( +test )]

ID  Age Tag  Description              Urg 
104 28s test äöü abcd                    0
               2019-09-15 bla foo bar     

The description line is truncated by the number of umlauts. Now we add some fill characters so that the description line becomes longer than the annotation line:

$ task 104 mod "äöü sssssssssssssss abcdefg"
Modifying task 104 'äöü sssssssssssssss abcdefg'.
Modified 1 task.
$ task +test
[task next ( +test )]

ID  Age  Tag  Description                 Urg 
104 1min test äöü sssssssssssssss abcdefg    0
                2019-09-15 bla foo bar        

The description line is not truncated anymore. Let's adjust the number of fill characters:

$ task 104 mod "äöü ssssssssssss abcdefg"
Modifying task 104 'äöü ssssssssssss abcdefg'.
Modified 1 task.
$ task +test
[task next ( +test )]

ID  Age  Tag  Description              Urg 
104 1min test äöü ssssssssssss abcdefg    0
                2019-09-15 bla foo bar     

It still works. Remove one more fill character:

$ task 104 mod "äöü sssssssssss abcdefg"
Modifying task 104 'äöü sssssssssss abcdefg'.
Modified 1 task.
$ task +test
[task next ( +test )]

ID  Age  Tag  Description              Urg 
104 1min test äöü sssssssssss abcd        0
                2019-09-15 bla foo bar     

The trailing characters are truncated again.

And finally, the output of "task diag":

$ task diag

task 2.5.1
   Platform: Linux

Compiler
    Version: 7.2.0
       Caps: +stdc +stdc_hosted +LP64 +c8 +i32 +l64 +vp64 +time_t64
 Compliance: C++11

Build Features
      CMake: 3.9.5
    libuuid: libuuid + uuid_unparse_lower
  libgnutls: 3.5.8
 Build type: None

Configuration
       File: /home/cs/.taskrc (found), 3529 bytes, mode 100644
       Data: /home/cs/.task (found), dir, mode 40755
    Locking: Enabled
         GC: Enabled
     Server: <censored>
         CA: /home/cs/.task/ca.cert.pem, readable, 3731 bytes
      Trust: ignore hostname
Certificate: /home/cs/.task/client.cert.pem, readable, 3714 bytes
        Key: /home/cs/.task/client.key.pem, readable, 24742 bytes
    Ciphers: NORMAL
      Creds: PRIV/cs/************************************

Hooks
     System: Enabled
   Location: /home/cs/.task/hooks
             (-none-)

Tests
      $TERM: xterm-256color (323x70)
       Dups: Scanned 176 tasks for duplicate UUIDs:
             No duplicates found
 Broken ref: Scanned 176 tasks for broken references:
             No broken references found

ChargingBulle commented 5 years ago

This is a very important issue, since we Germans use umlauts all the time.

bjornfor commented 5 years ago

It's important to Scandinavians too (I'm affected!).

zoyo-de commented 5 years ago

line_length counts the visible characters, not the bytes in the source string, whereas substr takes a byte count. So this looks fine at first glance:

diff --git a/src/text.cpp b/src/text.cpp
index f5e3496be..b96393c33 100644
--- a/src/text.cpp
+++ b/src/text.cpp
@@ -248,7 +248,7 @@ bool extractLine (
     // Premature EOL.
     if (character == '\n')
     {
-      line = text.substr (offset, line_length);
+      line = text.substr (offset, cursor-offset-1);
       offset = cursor;
       return true;
     }
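
To illustrate the mismatch, here is a minimal standalone sketch, not Taskwarrior's actual code. It assumes the text is UTF-8, where each umlaut occupies two bytes. The helper names utf8_length, cut_by_chars and cut_to_newline are hypothetical and exist only for this example. cut_by_chars mimics the old call by passing a visible-character count to substr, and cut_to_newline mimics the idea of the patch by cutting at the byte position of the newline.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count UTF-8 code points by skipping continuation bytes (10xxxxxx).
// Hypothetical helper, assumes valid UTF-8 input.
std::size_t utf8_length(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Mimics the old behaviour: a visible-character count is handed to
// substr, which interprets it as a number of bytes.
std::string cut_by_chars(const std::string& text, std::size_t offset,
                         std::size_t line_length) {
    return text.substr(offset, line_length);
}

// Mimics the idea of the patch: cut at the byte position of the
// newline, so multibyte characters no longer shorten the result.
std::string cut_to_newline(const std::string& text, std::size_t offset) {
    assert(offset <= text.size());
    const std::size_t eol = text.find('\n', offset);
    if (eol == std::string::npos) return text.substr(offset);
    return text.substr(offset, eol - offset);
}
```

With the description "äöü abcdefg" followed by a newline and the annotation, utf8_length reports 11 characters while the description occupies 14 bytes. Passing 11 to cut_by_chars therefore stops three bytes early and yields "äöü abcd", which matches the truncated report above, whereas cut_to_newline returns the full description.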

mordae commented 5 years ago

Works for me and valgrind seems happy about this as well.