gorse-io / gorse

Gorse open source recommender system engine
https://gorse.io
Apache License 2.0
8.62k stars 785 forks

Exception caused shutdown #783

Open ibraheemalayan opened 1 year ago

ibraheemalayan commented 1 year ago

Gorse version: the latest gorse-in-one Docker image (0.4.14).

Describe the bug: I have no idea how it occurred. I sent a request for neighbouring items and the load balancer returned a 503 Service Unavailable. When I checked the container, it had shut down, and the last logs contain the following:

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   timestamp   |                                                                                                              message                                                                                                              |
|---------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 1695724922516 | {"level":"info","ts":1695724922.5168717,"msg":"load config","config":"/etc/gorse/config.toml"}                                                                                                                                    |
| 1695724922518 | {"level":"info","ts":1695724922.5180194,"msg":"load cache","path":"/var/lib/gorse/master_cache.data"}                                                                                                                             |
| 1695724922518 | {"level":"info","ts":1695724922.5180697,"msg":"no local cache found, create a new one","path":"/var/lib/gorse/master_cache.data"}                                                                                                 |
| 1695724922615 | {"level":"info","ts":1695724922.6155522,"msg":"start model fit","period":3600}                                                                                                                                                    |
| 1695724922615 | {"level":"info","ts":1695724922.6157806,"msg":"start model searcher","period":21600}                                                                                                                                              |
| 1695724922616 | {"level":"info","ts":1695724922.6160376,"msg":"load dataset","positive_feedback_types":["place_order","add_to_cart","save_listing"],"read_feedback_types":["open_product_page","see_product_card"],"item_ttl":0,"feedback_ttl":0} |
| 1695724922616 | {"level":"info","ts":1695724922.6163213,"msg":"start rpc server","host":"0.0.0.0","port":8086}                                                                                                                                    |
| 1695724922621 | {"level":"info","ts":1695724922.6212022,"msg":"start http server","url":"http://0.0.0.0:8088","cors_methods":[],"cors_doamins":[]}                                                                                                |
| 1695724922666 | {"level":"info","ts":1695724922.6667936,"msg":"prepare to fit click model","n_jobs":1}                                                                                                                                            |
| 1695724922666 | {"level":"warn","ts":1695724922.666884,"msg":"empty ranking dataset","positive_feedback_type":["place_order","add_to_cart","save_listing"]}                                                                                       |
| 1695724922666 | {"level":"info","ts":1695724922.666905,"msg":"start searching neighbors of users","n_cache":100}                                                                                                                                  |
| 1695724922670 | panic: runtime error: index out of range [-1]                                                                                                                                                                                     |
| 1695724922670 | goroutine 89 [running]:                                                                                                                                                                                                           |
| 1695724922670 | github.com/zhenghaoz/gorse/base/search.(*IVF).Build.func1(0x50?, 0x0)                                                                                                                                                             |
| 1695724922670 |  /go/gorse/base/search/ivf.go:222 +0x38d                                                                                                                                                                                          |
| 1695724922670 | github.com/zhenghaoz/gorse/base/parallel.Parallel(0x3, 0xc00038e360?, 0xc00002c050)                                                                                                                                               |
| 1695724922670 |  /go/gorse/base/parallel/parallel.go:39 +0xf7                                                                                                                                                                                     |
| 1695724922670 | github.com/zhenghaoz/gorse/base/search.(*IVF).Build(0xc0004108a0)                                                                                                                                                                 |
| 1695724922670 |  /go/gorse/base/search/ivf.go:209 +0x525                                                                                                                                                                                          |
| 1695724922670 | github.com/zhenghaoz/gorse/base/search.(*IVFBuilder).Build(0xc00002c000, 0x3f4ccccd, 0x3, 0x64?, 0xc000160070)                                                                                                                    |
| 1695724922670 |  /go/gorse/base/search/ivf.go:294 +0x175                                                                                                                                                                                          |
| 1695724922670 | github.com/zhenghaoz/gorse/master.(*Master).findUserNeighborsIVF(0xc000245c00, 0xc0009a5d40, {0x3095650, 0x0, 0x0}, {0xc000100a80, 0x18, 0x18}, 0xc000100900, 0xc0000a2e40)                                                       |
| 1695724922670 |  /go/gorse/master/tasks.go:780 +0x478                                                                                                                                                                                             |
| 1695724922670 | github.com/zhenghaoz/gorse/master.(*FindUserNeighborsTask).run(0xc000010f00, 0x1814fb0?)                                                                                                                                          |
| 1695724922670 |  /go/gorse/master/tasks.go:651 +0x606                                                                                                                                                                                             |
| 1695724922670 | github.com/zhenghaoz/gorse/master.(*Master).RunPrivilegedTasksLoop.func2({0x1e141a0, 0xc000010f00})                                                                                                                               |
| 1695724922670 |  /go/gorse/master/master.go:330 +0x18c                                                                                                                                                                                            |
| 1695724922670 | created by github.com/zhenghaoz/gorse/master.(*Master).RunPrivilegedTasksLoop                                                                                                                                                     |
| 1695724922670 |  /go/gorse/master/master.go:326 +0x74e                                                                                                                                                                                            |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

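As you can see, the failure is a Go runtime panic, index out of range [-1], raised in (*IVF).Build (ivf.go:222) immediately after the search for user neighbors started.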

Additional context: It was working fine for about a week.

I am using PostgreSQL as the data store and Redis as the cache store.

TreehouseFalcon commented 1 year ago

I made a PR to fix this a few weeks ago. Since no version has been cut with this fix yet, try using this Docker image hash: zhenghaoz/gorse-master@sha256:026d1bd4ad3f861bb45be0d5f141bfe90ce244dfbefaf85a343dc0276a8100b0

That's the hash for gorse-master. If you're using gorse-in-one, you'll have to find the corresponding hash for that image.

cc @zhenghaoz Would it be a good idea to cut 0.4.15 with this fix?