Open jackmaninov opened 4 months ago
OK, I think I found something. It has to do with my conditional role "filtering" and the lack of a pre-filter on my record list block. The record list seems to be erroring out in circumstances where the conditional role disallows reads on a record that would otherwise be shown in the record list.
I'll see if I can work up a small, reproducible test case...
Ok, I have a reproducible demo of the panic running locally. Gist here: https://gist.github.com/jackmaninov/482d195f9fd7969ec3a2ca3c18d6da53
Users:
admin@corteza-test.com / FgR3924bpT@9p73j
user2@corteza-test.com / jTW^!&@%vW74%hhc
With the database loaded from panic.sql and logged in as User2, going to the "Panic Debug" namespace and viewing the record list for "Debug Module", the user should see "Could not load record list: Request failed with status code 500" and the server will log a panic. User2 is expected to be able to see the 11 records they own in the module (due to the conditional role and RBAC setup); there are also 4 records owned by the Admin, created BEFORE User2's records, that they shouldn't be able to see.
I think the problem is as follows:
If User2 creates enough records (15) so that 14 records still exist after the RBAC filtering occurs, the panic no longer happens on the initial load (database state no-panic.sql). However, I get the same panic if I hit Next to page forward and then Previous to return. So I'm guessing the problem is in the record cursor / pagination logic when RBAC removes viewable records from a "viewport".
A sample panic again, from running under this local setup:
runtime error: invalid memory address or nil pointer dereference
================================================================================
--------------------------------------------------------------------------------
goroutine 3703 [running]:
runtime/debug.Stack()
/opt/hostedtoolcache/go/1.19.13/x64/src/runtime/debug/stack.go:24 +0x65
runtime/debug.PrintStack()
/opt/hostedtoolcache/go/1.19.13/x64/src/runtime/debug/stack.go:16 +0x19
github.com/cortezaproject/corteza/server/pkg/api/server.panicRecovery({0x2abfcf0, 0xc0015ee300}, {0x2abd510, 0xc00088e0e0})
/home/runner/work/corteza/corteza/server/pkg/api/server/middleware.go:43 +0x111
panic({0x1f70ce0, 0x3b40bb0})
/opt/hostedtoolcache/go/1.19.13/x64/src/runtime/panic.go:884 +0x212
github.com/getsentry/sentry-go/http.(*Handler).recoverWithSentry(0xc000bde890, 0x189bc2b?, 0xc000ea0c00)
/home/runner/go/pkg/mod/github.com/getsentry/sentry-go@v0.13.0/http/sentryhttp.go:117 +0xf0
panic({0x1f70ce0, 0x3b40bb0})
/opt/hostedtoolcache/go/1.19.13/x64/src/runtime/panic.go:884 +0x212
github.com/cortezaproject/corteza/server/compose/types.(*Record).GetValue(0x0, {0xc0016bc410, 0x9}, 0x0)
/home/runner/work/corteza/corteza/server/compose/types/getters_setters.gen.go:401 +0x450
github.com/cortezaproject/corteza/server/store/adapters/rdbms/dal.(*iterator).collectCursorValues(0xc000c2a580, {0x2ab3730, 0x0})
/home/runner/work/corteza/corteza/server/store/adapters/rdbms/dal/iterator.go:298 +0x1b4
github.com/cortezaproject/corteza/server/store/adapters/rdbms/dal.(*iterator).More(0xc000c2a580, 0x3?, {0x2ab3730?, 0x0?})
/home/runner/work/corteza/corteza/server/store/adapters/rdbms/dal/iterator.go:72 +0xe8
github.com/cortezaproject/corteza/server/compose/dalutils.drainIterator({_, _}, {_, _}, _, {0x573329a47000003, 0x573327bb1000003, {0x0, 0x0}, 0x0, ...})
/home/runner/work/corteza/corteza/server/compose/dalutils/records.go:188 +0x42b
github.com/cortezaproject/corteza/server/compose/dalutils.ComposeRecordsList({_, _}, {_, _}, _, {0x573329a47000003, 0x573327bb1000003, {0x0, 0x0}, 0x0, ...})
/home/runner/work/corteza/corteza/server/compose/dalutils/records.go:41 +0x165
github.com/cortezaproject/corteza/server/compose/service.record.Find.func1({0x2abfd98, 0xc00101d800}, 0xc00107a460, 0xc000276640, 0xc00107a410, 0xc00107a428, 0xc00107a438, 0xc00107a5b0)
/home/runner/work/corteza/corteza/server/compose/service/record.go:387 +0x236
github.com/cortezaproject/corteza/server/compose/service.record.Find({{0x2ac8130, 0xc000a44240}, {0x2ab37d0, 0xc000d23bf0}, {0x2ad7fe0, 0xc000d23d10}, {0x2ab4108, 0xc000456e88}, {0x2b13d78, 0xc000aaa100}, ...}, ...)
/home/runner/work/corteza/corteza/server/compose/service/record.go:400 +0x175
github.com/cortezaproject/corteza/server/compose/rest.(*Record).List(0xc0013719e0, {0x2abfd98, 0xc00101d800}, 0xc0009eb740)
/home/runner/work/corteza/corteza/server/compose/rest/record.go:146 +0x36a
github.com/cortezaproject/corteza/server/compose/rest/handlers.NewRecord.func2({0x2abd510, 0xc00088e0e0}, 0xc000ea1600)
/home/runner/work/corteza/corteza/server/compose/rest/handlers/record.go:93 +0x16f
net/http.HandlerFunc.ServeHTTP(0x2abfd98?, {0x2abd510?, 0xc00088e0e0?}, 0x1?)
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:2109 +0x2f
github.com/cortezaproject/corteza/server/pkg/auth.HttpTokenValidator.func1.1({0x2abd510, 0xc00088e0e0}, 0xc000ea1600)
/home/runner/work/corteza/corteza/server/pkg/auth/token_middleware.go:80 +0xba
net/http.HandlerFunc.ServeHTTP(0xc00095ea80?, {0x2abd510?, 0xc00088e0e0?}, 0xc000441b70?)
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:2109 +0x2f
github.com/go-chi/chi/v5.(*ChainHandler).ServeHTTP(0x1f566e0?, {0x2abd510?, 0xc00088e0e0?}, 0xc0009362c0?)
/home/runner/go/pkg/mod/github.com/go-chi/chi/v5@v5.0.7/chain.go:31 +0x2c
github.com/go-chi/chi/v5.(*Mux).routeHTTP(0xc001371980, {0x2abd510, 0xc00088e0e0}, 0xc000ea1600)
/home/runner/go/pkg/mod/github.com/go-chi/chi/v5@v5.0.7/mux.go:442 +0x216
net/http.HandlerFunc.ServeHTTP(0x2abfd98?, {0x2abd510?, 0xc00088f960?}, 0x3dc59b8?)
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:2109 +0x2f
github.com/cortezaproject/corteza/server/pkg/api/server.contextLogger.func1.1({0x2abd510, 0xc00088f960}, 0xc000ea1200)
/home/runner/work/corteza/corteza/server/pkg/api/server/logger.go:28 +0x394
net/http.HandlerFunc.ServeHTTP(0x2abfd98?, {0x2abd510?, 0xc00088f960?}, 0x3dc59b8?)
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:2109 +0x2f
github.com/cortezaproject/corteza/server/pkg/api.DebugToContext.func1.1({0x2abd510, 0xc00088f960}, 0xc000ea1100)
/home/runner/work/corteza/corteza/server/pkg/api/debug.go:15 +0x151
net/http.HandlerFunc.ServeHTTP(0x2abfd98?, {0x2abd510?, 0xc00088f960?}, 0x2a9db38?)
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:2109 +0x2f
github.com/go-chi/chi/v5/middleware.RequestID.func1({0x2abd510, 0xc00088f960}, 0xc000ea1000)
/home/runner/go/pkg/mod/github.com/go-chi/chi/v5@v5.0.7/middleware/request_id.go:76 +0x22d
net/http.HandlerFunc.ServeHTTP(0x2abfd98?, {0x2abd510?, 0xc00088f960?}, 0x2a9db38?)
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:2109 +0x2f
github.com/cortezaproject/corteza/server/pkg/api.RemoteAddrToContext.func1({0x2abd510, 0xc00088f960}, 0xc000ea0f00)
/home/runner/work/corteza/corteza/server/pkg/api/ipaddr.go:17 +0x16e
net/http.HandlerFunc.ServeHTTP(0xc000ea0f00?, {0x2abd510?, 0xc00088f960?}, 0xc000ea0f00?)
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:2109 +0x2f
github.com/go-chi/chi/v5/middleware.RealIP.func1({0x2abd510, 0xc00088f960}, 0xc000ea0f00)
/home/runner/go/pkg/mod/github.com/go-chi/chi/v5@v5.0.7/middleware/realip.go:35 +0x9e
net/http.HandlerFunc.ServeHTTP(0x2abfd98?, {0x2abd510?, 0xc00088f960?}, 0x3dc59b8?)
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:2109 +0x2f
github.com/cortezaproject/corteza/server/pkg/locale.DetectLanguage.func1.1({0x2abd510, 0xc00088f960}, 0xc000ea0e00)
/home/runner/work/corteza/corteza/server/pkg/locale/http.go:26 +0x2ad
net/http.HandlerFunc.ServeHTTP(0xc0007ff400?, {0x2abd510?, 0xc00088f960?}, 0xc000ea0e00?)
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:2109 +0x2f
github.com/go-chi/cors.(*Cors).Handler.func1({0x2abd510, 0xc00088f960}, 0xc000ea0e00)
/home/runner/go/pkg/mod/github.com/go-chi/cors@v1.2.1/cors.go:228 +0x1bd
net/http.HandlerFunc.ServeHTTP(0xc001778ff0?, {0x2abd510?, 0xc00088f960?}, 0xb?)
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:2109 +0x2f
github.com/getsentry/sentry-go/http.(*Handler).handle.func1({0x2abd510, 0xc00088f960}, 0xc000ea0d00)
/home/runner/go/pkg/mod/github.com/getsentry/sentry-go@v0.13.0/http/sentryhttp.go:103 +0x3c3
net/http.HandlerFunc.ServeHTTP(0xc001008570?, {0x2abd510?, 0xc00088f960?}, 0xc000c89800?)
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:2109 +0x2f
github.com/go-chi/chi/v5.(*Mux).ServeHTTP(0xc001370180, {0x2abd510, 0xc00088f960}, 0xc000ea0d00)
/home/runner/go/pkg/mod/github.com/go-chi/chi/v5@v5.0.7/mux.go:71 +0x355
github.com/go-chi/chi/v5.(*Mux).Mount.func1({0x2abd510, 0xc00088f960}, 0xc000ea0d00)
/home/runner/go/pkg/mod/github.com/go-chi/chi/v5@v5.0.7/mux.go:314 +0x19c
net/http.HandlerFunc.ServeHTTP(0x1f566e0?, {0x2abd510?, 0xc00088f960?}, 0xc0000fe2c4?)
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:2109 +0x2f
github.com/go-chi/chi/v5.(*Mux).routeHTTP(0xc001370120, {0x2abd510, 0xc00088f960}, 0xc000ea0d00)
/home/runner/go/pkg/mod/github.com/go-chi/chi/v5@v5.0.7/mux.go:442 +0x216
net/http.HandlerFunc.ServeHTTP(0xc000794dc0?, {0x2abd510?, 0xc00088f960?}, 0xc000ea0d00?)
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:2109 +0x2f
github.com/go-chi/cors.(*Cors).Handler.func1({0x2abd510, 0xc00088f960}, 0xc000ea0d00)
/home/runner/go/pkg/mod/github.com/go-chi/cors@v1.2.1/cors.go:228 +0x1bd
net/http.HandlerFunc.ServeHTTP(0x2abfcf0?, {0x2abd510?, 0xc00088f960?}, 0x3b40570?)
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:2109 +0x2f
github.com/go-chi/chi/v5.(*Mux).ServeHTTP(0xc001370120, {0x2abd510, 0xc00088f960}, 0xc000ea0c00)
/home/runner/go/pkg/mod/github.com/go-chi/chi/v5@v5.0.7/mux.go:88 +0x310
github.com/cortezaproject/corteza/server/pkg/api/server.(*demux).ServeHTTP(0xc0000fe310?, {0x2abd510?, 0xc00088f960}, 0xc000ea0c00)
/home/runner/work/corteza/corteza/server/pkg/api/server/demux.go:51 +0x113
net/http.serverHandler.ServeHTTP({0xc000aae990?}, {0x2abd510, 0xc00088f960}, 0xc000ea0c00)
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:2947 +0x30c
net/http.(*conn).serve(0xc0003bc140, {0x2abfd98, 0xc0002d6690})
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:1991 +0x607
created by net/http.(*Server).Serve
/opt/hostedtoolcache/go/1.19.13/x64/src/net/http/server.go:3102 +0x4db
================================================================================
Thanks for reporting this in such detail, we'll take a look asap.
@jackmaninov Could you check that you're on the latest version (pull the Docker image again if on 2023.9.6)? There were some patches to the image, so it might be that this was fixed. We're still investigating and have been unable to replicate the issue.
EDIT: Nevermind, was able to reproduce it finally.
@Fajfa Yes, I can confirm I am running the latest image:
REPOSITORY TAG DIGEST IMAGE ID CREATED SIZE
cortezaproject/corteza 2023.9.6 sha256:76db0186338728935d7710cbf869ee4a026156691e02c715517cd308a84f34d9 sha256:38eaa9776a32db905dc02bb4dc34afec19a1317926035a0f4ec7e4b5cf7e38e0 3 weeks ago 428MB
I realised I didn't describe my RBAC setup, just gave the SQL dump; it may be helpful if someone comes across this issue in the future.
Basically, on the debug module, the Authenticated role should be able to create records but is denied any other access to view/modify them. Users then receive access to their records through a conditional role I call Allowed Users, with the conditional expression I gave above:
resource.ownedBy == userID || has(resource.values.authorizedEditors ?? [], userID)
I use this setup so that users can access their own records as well as delegate access to (multiple) other users.
I have been investigating this issue. From your server logs I noticed that it could be related to the created_at value being nil, and I went a step further to check the sorting behaviour of timestamp fields during pagination, specifically where records share the same timestamp down to the second but differ by microseconds.
Currently, Corteza supports time precision only down to the second, but saves timestamp fields down to the millisecond (Postgres). This limitation causes a sorting issue when paginating on fields such as created_at, especially when multiple records share the same timestamp to the second but differ by milliseconds.
An example of the paginated sort query generated when there are fewer than 1000 records, after the first page fetch, looks like this:
SELECT
"compose_record"."values",
"compose_record"."id",
"compose_record"."rel_module",
"compose_record"."deleted_by",
"compose_record"."rel_namespace",
"compose_record"."revision",
"compose_record"."meta",
"compose_record"."owned_by",
"compose_record"."created_at",
"compose_record"."created_by",
"compose_record"."updated_at",
"compose_record"."updated_by",
"compose_record"."deleted_at"
FROM
"compose_record"
WHERE
(
("compose_record"."rel_module" IN (395060337979686913))
AND ("compose_record"."rel_namespace" IN (394182424018616321))
AND "compose_record"."deleted_at" IS NULL
AND (
(
("compose_record"."created_at" IS NULL AND 1=1)
OR ("compose_record"."created_at" < '2024-06-18T12:38:09Z')
)
OR (
(
("compose_record"."created_at" IS NULL AND 1=0)
OR ("compose_record"."created_at" = '2024-06-18T12:38:09Z')
)
AND (
("compose_record"."id" IS NULL AND 1=1)
OR ("compose_record"."id" < 395060337985257473)
)
)
)
)
ORDER BY
"compose_record"."created_at" DESC NULLS LAST,
"compose_record"."id" DESC NULLS LAST
LIMIT 1000;
I have been investigating this issue, from your server logs I noticed that the issue could be related with that of created_at value being nil,
Just looked at my sample dataset again, and none of the records have a null created_at field, either when viewed in Corteza or directly in the SQL dump, so I think this could only occur if there were some issue with Corteza loading records from the database.
I could understand an issue like this occurring as a result of some race condition or while bulk importing data, but all the records in my debug dataset were created manually, one at a time and several seconds apart.
Instead, I think the server is hitting nils on the created_at or ID fields because it is trying to read past the end of a list/set that it expects to contain, e.g., 14 records, but which actually has fewer because RBAC filtered some of them out.
Is there an existing issue for this?
Version of Corteza
2023.9.6
Current Behavior
I've been seeing a server panic intermittently on the Corteza server over the last two versions (also on 2023.9.5) in my production setup. The panic usually occurs while a viewer is on a module's record list page loading a decent number of records (more than 10). Usually the user is unaffected; the server simply records the panic in its log and proceeds.
However, today I had a user complaining about a persistent 500 error popping up in the lower right-hand corner of their screen, and their records would not load. Each time they reloaded the record list, the same panic was written to the log:
The panics happen on the record lists of different modules, and looking at corteza/server/compose/types/getters_setters.gen.go:401 it seems the issue occurs while trying to read the createdAt field on a record, so I don't think the specifics of the module matter. However, the line that usually leads to the panic is /home/runner/work/corteza/corteza/server/compose/types/getters_setters.gen.go:409, which reads the ID field.
The issue does seem related to the number of records being fetched, as the user was able to get out of the situation and view their records by adding a new record. Amusingly enough, the problem occurred when they had a lucky 13 records; adding the 14th got past it. There was only one other record of this module, owned by a different user, that wouldn't have been visible to the affected user. Some off-by-one issue?
There is nothing special about the modules I've seen the error happening on; the two I'm sure of have 15 and 25 fields on them, respectively. What is special is that they're being listed to users with a conditional role (which is behaving as expected, though I'm not sure whether it is related):
resource.ownedBy == userID || has(resource.values.authorizedEditors ?? [], userID)
Very stumped on this one, I'll need some guidance on what this could be related to and how I could attempt to generate a minimum reproducible case.
Expected Behavior
Records list as normal, no server panic
Steps To Reproduce
Not sure how I can reproduce this one, I'll need some guidance from the devs.
Environment and versions
Anything else?
No response