PretendoNetwork / friends

Pretendo Network friends server
GNU Affero General Public License v3.0
29 stars 9 forks source link

fix(database): Proper checks on QueryRow for sql.ErrNoRows #17

Closed DaniElectra closed 4 months ago

DaniElectra commented 4 months ago

With the transition to our sql-manager, the QueryRow functions had to be modified to make the Scan call later instead of inlined as it was previously. This affected all checks for sql.ErrNoRows, since this is returned by Scan and not QueryRow. Move the error check to Scan to fix the issues.

jonbarrow commented 4 months ago

Also one of the errors we're getting is:

[CRITICAL] [func AddBlackList() github.com/PretendoNetwork/friends/nex/friends-wiiu/add_blacklist.go:43] : pq: there is no unique or exclusion constraint matching the ON CONFLICT specification

I fixed this manually by adding a unique constraint through pgAdmin, but I forgot to actually commit the changes to the DB setup. I used:

ALTER TABLE wiiu.blocks
ADD CONSTRAINT unique_blocker_blocked UNIQUE (blocker_pid, blocked_pid);

And while we're here, it looks like my previous changes also introduced a new panic? It seems to only occur when my PNID logs in, I haven't seen any other client trigger this panic:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x50e3a4]
goroutine 1583924 [running]:
github.com/PretendoNetwork/nex-protocols-go/v2/friends-wiiu/types.(*NintendoPresenceV2).WriteTo(0x0, {0x850240, 0x40056dd500})
    /home/jonbarrow/go/pkg/mod/github.com/!pretendo!network/nex-protocols-go/v2@v2.0.1/friends-wiiu/types/nintendo_presence_v2.go:34 +0x24
github.com/PretendoNetwork/nex-protocols-go/v2/friends-wiiu/types.(*FriendInfo).WriteTo(0x4001b60b80, {0x850240
, 0x40056dd500})
    /home/jonbarrow/go/pkg/mod/github.com/!pretendo!network/nex-protocols-go/v2@v2.0.1/friends-wiiu/types/friend_info.go:30 +0x7c
github.com/PretendoNetwork/nex-go/v2/types.(*List[...]).WriteTo(0x856300, {0x850240, 0x40056dd500})
    /home/jonbarrow/go/pkg/mod/github.com/!pretendo!network/nex-go/v2@v2.0.1/types/list.go:23 +0x78
github.com/PretendoNetwork/friends/nex/friends-wiiu.UpdateAndGetAllInformation({0x0?, 0x0?}, {0x84a5c0?, 0x4002d58990?}, 0x13, 0x40022a3ce0, 0x4002d58bd0, 0x4003af99b8)
    /home/jonbarrow/friends/nex/friends-wiiu/update_and_get_all_information.go:172 +0x13c8
github.com/PretendoNetwork/nex-protocols-go/v2/friends-wiiu.(*Protocol).handleUpdateAndGetAllInformation(0x40000d6460, {0x84a5c0, 0x4002d58990})
    /home/jonbarrow/go/pkg/mod/github.com/!pretendo!network/nex-protocols-go/v2@v2.0.1/friends-wiiu/update_and_get_all_information.go:65 +0x320
github.com/PretendoNetwork/nex-protocols-go/v2/friends-wiiu.(*Protocol).HandlePacket(0x40000d6460, {0x84a5c0, 0x4002d58990})
    /home/jonbarrow/go/pkg/mod/github.com/!pretendo!network/nex-protocols-go/v2@v2.0.1/friends-wiiu/protocol.go:257 +0xa0
created by github.com/PretendoNetwork/nex-go/v2.(*PRUDPEndPoint).emit in goroutine 1583921
    /home/jonbarrow/go/pkg/mod/github.com/!pretendo!network/nex-go/v2@v2.0.1/prudp_endpoint.go:85 +0x88

It looks like when trying to get my friend list, one of the items has bad presence data? The panic occurs in FriendInfo.WriteTo on the fi.Presence.WriteTo(writable) call

Specifically on npv.Data.WriteTo(writable)? It seems like Data in the NintendoPresenceV2 is nil for some reason?

I wonder if this is some form of race condition. Like the server tried to write my friends list data, but the same time one of my friends went offline and thus their data was niled out? But that wouldn't explain why I haven't seen other clients trigger the panic.

If it IS a race condition like that though, then it shows a pretty big flaw in the way we designed these structures which I had not considered until now. We use pointers everywhere, which WOULD run this risk if we always just pass the same pointer around all the time. A fix would be to no longer use pointers everywhere, so everyone has their own hard copy of the data. But that would come with some higher memory consumption and we'd have to rework just about everything in all 3 libraries.

Alternatively we can make sure we always use .Copy() everywhere, to avoid having to rework everything, but it still has the memory usage issue.

DaniElectra commented 4 months ago

pq: there is no unique or exclusion constraint matching the ON CONFLICT specification

Not sure I understand this? There is already a UNIQUE constraint on wiiu.blocks

imagen

And while we're here, it looks like my previous changes also introduced a new panic?

I'm not sure how that panic is happening tbh. I guess it could be a race condition like you say, but I'd prefer to investigate on that later and merge this first

jonbarrow commented 4 months ago

Not sure I understand this? There is already a UNIQUE constraint on wiiu.blocks

I also didn't know why this started happening, I came to the same conclusion you did. But after adding a named constraint it went away?

EDIT: Nevermind actually, I see the real issue. The SQL is fine as it is. This is actually from an old change from over a year ago and I never altered the table I guess. @hauntii had added this unique constraint in another PR https://github.com/PretendoNetwork/friends/pull/2/files#diff-167f5c909c22ff554652b2a6b21ec9d0360376b47a42af5cd627f0261590651b

I'm not sure how that panic is happening tbh. I guess it could be a race condition like you say, but I'd prefer to investigate on that later and merge this first

That sounds reasonable :+1: just wanted to make you aware