gomodule / redigo

Go client for Redis
Apache License 2.0
9.76k stars 1.25k forks source link

Unhandled text encoding differences #587

Closed TropicalPenguin closed 2 years ago

TropicalPenguin commented 2 years ago

Hello,

I recently started using the latest redisgraph from https://github.com/RedisGraph/RedisGraph master branch, revision 7d550c74 (I was previously using a457762e). And I've found that when retrieving content containing some multi-line text (UNIX-style, LF) previously stored in the database, the redigo function func (c *conn) readLine() ([]byte, error) (in conn.go) has begun to return protocolError("bad response line terminator").

The text in question (copied from redis-cli; I first noticed it with Arabic text, but it turns out the language doesn't matter; the same occurs with English text with lines separated by LF, '\n'):

أَأَنتَ أَم أَنا هَذا في إِلَهَينِ
حاشاكَ حاشاكَ مِن إِثباتِ اِثنَينِ
هُوِيَّةٌ لَكَ في لائِيَّتي أَبَداً
كُلّي عَلى الكُلِّ تَلبيسُ بِوَجهَينِ
فَأَينَ ذاتُكَ عَنّي حَيثُ كُنتُ أرى
فَقَد تَبَيَّنَ ذاتي حَيثُ لا أَيني
فَأَينَ وَجهُكَ مَقصوداً بِناظِرَتي
في باطِنِ القَلبِ أَم في ناظِرِ العَينِ
بَيني وَبَينَكَ إِنِيٌّ يُنازِعُني
فَاِرفَع بِلُطفِكَ إِنِيِّ مِنَ البَينِ

Yields first a byte array: [43 216 163 217 142 216 163 217 142 217 134 216 170 217 142 32 216 163 217 142 217 133 32 216 163 217 142 217 134 216 167 32 217 135 217 142 216 176 216 167 32 217 129 217 138 32 216 165 217 144 217 132 217 142 217 135 217 142 217 138 217 134 217 144 10], which corresponds to the string: +أَأَنتَ أَم أَنا هَذا في إِلَهَينِ

I didn't get very far injecting a brief hack before the condition expecting a carriage return :

    if i > 0 && p[i] != '\r' && p[i+1] == '\n' {
        if p[0] == '+' {
            p = append(p, '\n')
            i = i + 1
            p[i] = '\r'
        } else {
            p = append([]byte{'+'}, p[:i+1]...)
            p = append(p, '\r', '\n')
            i = i + 2
        }
    } else if i == -1 && p[0] == '\n' {
        p = []byte{'+', '\r', '\n'}
        i = 1
    }

which yields:

+ﺃَﺄَﻨﺗَ ﺄَﻣ ﺄَﻧﺍ ﻩَﺫﺍ ﻒﻳ ﺈِﻠَﻬَﻴﻧِ 
+
+�ﺎﺷﺎﻛَ ﺡﺎﺷﺎﻛَ ﻢِﻧ ﺈِﺜﺑﺎﺗِ ﺎِﺜﻨَﻴﻧِ
+ﻩُﻮِﻳَّﺓٌ ﻞَﻛَ ﻒﻳ ﻼﺌِﻴَّﺘﻳ ﺄَﺑَﺩﺍً
ﻚُﻠّﻳ ﻊَﻟﻯ ﺎﻠﻜُﻟِّ ﺖَﻠﺒﻴﺳُ ﺏِﻮَﺠﻬَﻴﻧِ
+ﻑَﺄَﻴﻧَ ﺫﺎﺘُﻛَ ﻊَﻨّﻳ ﺢَﻴﺛُ ﻚُﻨﺗُ ﺃﺭﻯ
+ﻒَﻗَﺩ ﺖَﺒَﻴَّﻧَ ﺫﺎﺘﻳ ﺢَﻴﺛُ ﻻ ﺄَﻴﻨﻳ

After that sixth non-empty line (haven't looked into why the extra line - consisting only of a newline character - is created) redisgraph-go just crashed after receiving an empty response, so I thought I should probably defer to someone more familiar with this code...

I tried reverting my redisgraph.so to be able to contrast with the old encoding, and restarting redis... but the server failed to start with the error 'The RDB file contains module data for the module '' that is not terminated by the proper module value EOF marker', so it seems the former encoding has been replaced upon upgrade.

UPDATE 2021/12/18: If I try creating multi-line text with CRLF instead of just LF, instead of "bad response line terminator", redigo reports "unexpected response line"

redis-server --version Redis server v=6.0.15 sha=00000000:0 malloc=jemalloc-5.2.1 bits=64 build=4610f4c3acf7fb25

TropicalPenguin commented 2 years ago

Automatically resolved by https://github.com/RedisGraph/RedisGraph/pull/2078