Closed kudrykv closed 4 years ago
Great work. Thank you.
@kudrykv I pulled this locally and built it and now I am getting normal results only for the query e
. All other queries, for example, g
match nothing anymore. Any idea what is going on here?
I've reverted the binary file in the release back to the previous version. It would be helpful to me if you paste the unicode text of your upper and lower case non-latin character search and note title so I can test your test cases. If you are able to debug this with my latin text cases using e
and g
, that would be helpful as well.
I am getting normal results only for the query e. All other queries, for example, g match nothing anymore
@drgrib That is weird 😳 . I'll look in other binaries, whether I broke anything. Did you run search
or other binaries? Can you please give me more details? I have built and replaced my search
binary, it worked
unicode text of your upper and lower case non-latin character search and note title so I can test your test cases
e.g. СВІДОЦТВО
in the Bear note, and then свідоцтво
in Alfred. When I was doing that, I noticed that the library lowercases the search term (so it becomes свідоцтво
), but in the note is is mixed case (СВІДОЦТВО
). Then the query didn't work -- forced lowercased with Go didn't match not-lowercased non-ascii in sqlite.
@kudrykv Yeah, I am really at a loss why this isn't working for me because the code looks correct. But I'm the one who produces the binary file used for the workflow, so if my build is broken, it breaks it for everyone.
I ran this program as a sanity test and all the calls worked as expected, reproducing the bad result for lower
and good results for utflower
:
package main
import (
"database/sql"
"fmt"
"strings"
"github.com/mattn/go-sqlite3"
)
func testStringQuery(db *sql.DB, q string) {
row := db.QueryRow(q)
var s string
err := row.Scan(&s)
if err != nil {
panic(err)
}
fmt.Println(s)
fmt.Println()
}
func utfLower(s string) string {
return strings.ToLower(s)
}
func main() {
var q string
path := "test.db"
sql.Register("sqlite3_custom", &sqlite3.SQLiteDriver{
ConnectHook: func(conn *sqlite3.SQLiteConn) error {
return conn.RegisterFunc("utflower", utfLower, true)
},
})
db, err := sql.Open("sqlite3_custom", path)
if err != nil {
panic(err)
}
q = `SELECT lower('Hello world')`
testStringQuery(db, q)
q = `SELECT lower('Б')`
testStringQuery(db, q)
q = `SELECT utflower('Hello world')`
testStringQuery(db, q)
q = `SELECT utflower('Б')`
testStringQuery(db, q)
q = `SELECT utflower('g')`
testStringQuery(db, q)
q = `SELECT utflower('G')`
testStringQuery(db, q)
fmt.Println("--")
fmt.Println()
fmt.Println(utfLower("Б"))
fmt.Println(utfLower("g"))
}
Let me debug what is going wrong in my local Bear repo and I'll get back to you.
So here is what I've found. I can get the query to work if I change it to this:
NOTES_BY_QUERY = `
SELECT DISTINCT
note.ZUNIQUEIDENTIFIER, note.ZTITLE, group_concat(tag.ZTITLE)
FROM
ZSFNOTE note
LEFT OUTER JOIN Z_7TAGS nTag ON note.Z_PK = nTag.Z_7NOTES
LEFT OUTER JOIN ZSFNOTETAG tag ON nTag.Z_14TAGS = tag.Z_PK
WHERE
note.ZARCHIVED=0
AND note.ZTRASHED=0
AND (
utflower(note.ZTITLE) LIKE utflower('%%%s%%') OR
lower(note.ZTEXT) LIKE lower('%%%s%%')
)
GROUP BY note.ZUNIQUEIDENTIFIER
ORDER BY case when utflower(note.ZTITLE) LIKE utflower('%%%s%%') then 0 else 1 end, note.ZMODIFICATIONDATE DESC
LIMIT 400`
Notice that I use the built-in lower
for the ZTEXT
condition. Here is what I suspect: there is a limit on how large a string
can be that is sent to the RegisterFunc
function. I have some massive ZTEXT
in some of my notes.
What confuses me about this is that there is no error when I use utflower
for ZTEXT
. It just returns no results. Maybe sqlite3
has poor handling for function failures and returns zero results instead of an error? I'll take a glance at their RegisterFunc
code to see if I find evidence for this. But I'm open to insight you have as well.
@drgrib wow, that's a great finding! I'll also take a look what are the options there.
I've tried a synthetic case with 100 million characters, SQLite did fine:
package bearkeek
import (
"database/sql"
"math/rand"
"strings"
"testing"
"time"
"github.com/mattn/go-sqlite3"
)
func init() {
rand.Seed(time.Now().UnixNano())
sql.Register("sqlite_custom", &sqlite3.SQLiteDriver{
ConnectHook: func(conn *sqlite3.SQLiteConn) error {
return conn.RegisterFunc("utflower", strings.ToLower, true)
},
})
}
func TestTest(t *testing.T) {
open, err := sql.Open("sqlite_custom", "./test.sqlite3")
if err != nil {
t.Fatal(err)
}
if _, err = open.Exec("create table if not exists tbl(texty text);"); err != nil {
t.Fatal(err)
}
if _, err = open.Exec("delete from tbl"); err != nil {
t.Fatal(err)
}
searchTerm := "BANANA БАНАН"
theBigNote := RandStringRunes(50000000) + searchTerm + RandStringRunes(50000000)
if _, err = open.Exec("insert into tbl (texty) values (?)", theBigNote); err != nil {
t.Fatal(err)
}
query, err := open.Query("select texty from tbl where utflower(texty) like utflower(?)", "%"+searchTerm+"%")
if err != nil {
t.Fatal(err)
}
if query.Err() != nil {
t.Fatal(query.Err())
}
if !query.Next() {
t.Fatal("no rows")
}
var got string
if err = query.Scan(&got); err != nil {
t.Fatal(err)
}
t.Log(len([]rune(got)))
}
var letterRunes = []rune(
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" +
"абвгґдеєжзиіїйклмнопрстуфхцчшщьюяАБВГҐДЕЄЖЗИІЇЙКЛМНОПРСТУФХЦЧШЩЬЮЯ",
)
func RandStringRunes(n int) string {
b := make([]rune, n)
for i := range b {
b[i] = letterRunes[rand.Intn(len(letterRunes))]
}
return string(b)
}
It returned me the text (logging just length b/c that's too much to print).
I need to take a closer look on Bear table schema then.
Yeah, I tried large sizes as well and size did not seem to be the problem. So far I have discovered that the query works if I:
utflower(notes.ZTEXT)
to lower(notes.ZTEXT)
OR
GROUP BY ...
, ORDER BY ...
, and group_concat(...
elements)I'm really confused what is going on here. This is the minimal change I can get to function correctly, which is not desirable but better than nothing: https://github.com/drgrib/alfred-bear/pull/32.
SQLite3 lowercase and uppercase only ASCII. It ignores other charsets.
Bear knows how to search that:
bs
this note in recent, but it can't find it via word search:The PR addresses the issue leveraging the Golang lowercase, making it work: