andrewthad / rotera

Persistent rotating queue
BSD 3-Clause "New" or "Revised" License

segfault when persisting a connection to server #10

Closed chessai closed 5 years ago

chessai commented 5 years ago

i don't know enough about rotera's internals to debug this, but i went ahead and built it against primitive-checked 0.7, stood up rotera-server using systemd, and attempted to communicate with the server using the repl more than once. the result was this:

-- Logs begin at Sat 2018-03-10 17:57:06 EST, end at Sun 2019-06-09 20:07:03 EDT. --
Jun 09 20:05:33 chessai-kudu ngpyw70vm6cyfgxajvf3x2jnf53f6j42-unit-script-rotera-start[24642]:   writeByteArray, called at src/Rotera/Socket.hs:303:13 in rotera-0.1.0.0-4UQTIk01tfrCqol1LSIfOA:Rotera.Socket
Jun 09 20:05:33 chessai-kudu ngpyw70vm6cyfgxajvf3x2jnf53f6j42-unit-script-rotera-start[24642]:   check, called at src/Data/Primitive/ByteArray.hs:86:3 in primitive-checked-0.7.0.0-CFi7oENJ6EACxSiz4zzHFJ:Data.Primitive.ByteArray
Jun 09 20:05:33 chessai-kudu ngpyw70vm6cyfgxajvf3x2jnf53f6j42-unit-script-rotera-start[24642]: CallStack (from HasCallStack):
Jun 09 20:05:33 chessai-kudu ngpyw70vm6cyfgxajvf3x2jnf53f6j42-unit-script-rotera-start[24642]: rotera-server: array index out of range: Data.Primitive.ByteArray.writeByteArray: index of out bounds

So, it looks like the call to writeByteArray at src/Rotera/Socket.hs:303:13 is to blame.

chessai commented 5 years ago

that line is

PM.writeByteArray respBuf 1 (Fixed @'LittleEndian (fromIntegral nextEvent :: Word64))

chessai commented 5 years ago

@andrewthad do you have any insight into this? is this information helpful at all?

andrewthad commented 5 years ago

Dang it. The response buffer isn't large enough. I only allocate 8 bytes for it because I originally only sent the aliveness. But then I started sending the most recent committed message id as well and forgot to resize the buffer accordingly. Fixing this now.
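
For readers following along: writeByteArray indexes by element, not by byte, so writing a Word64 at index 1 touches bytes 8 through 15 and needs a buffer of at least 16 bytes. The sketch below illustrates that sizing rule with the primitive package. It is not the code from the actual fix: the helper name encodeResponse, the constant fieldBytes, and the choice to store the aliveness as a Word64 at index 0 are assumptions made for illustration, and the little-endian wrapper used in Rotera.Socket is left out for brevity.

```haskell
module Main (main) where

import Control.Monad.ST (runST)
import Data.Primitive.ByteArray
  ( ByteArray
  , indexByteArray
  , newByteArray
  , sizeofByteArray
  , unsafeFreezeByteArray
  , writeByteArray
  )
import Data.Word (Word64)

-- | Width of one response field in bytes (a Word64).
fieldBytes :: Int
fieldBytes = 8

-- | Hypothetical helper, not part of rotera: build a two-field response.
-- The buffer is sized for both fields up front. Allocating only
-- 'fieldBytes' here would make the second write go out of bounds,
-- which is the failure primitive-checked reported in this issue.
encodeResponse :: Word64 -> Word64 -> ByteArray
encodeResponse alive nextEvent = runST $ do
  buf <- newByteArray (2 * fieldBytes)
  -- Indices are in units of Word64: index 0 covers bytes 0-7,
  -- index 1 covers bytes 8-15.
  writeByteArray buf 0 alive
  writeByteArray buf 1 nextEvent
  unsafeFreezeByteArray buf

main :: IO ()
main = do
  let resp = encodeResponse 1 42
  print (sizeofByteArray resp)
  print (indexByteArray resp 0 :: Word64, indexByteArray resp 1 :: Word64)
```

Running this prints 16 and then (1,42). Building against primitive-checked instead of primitive, as done in this issue, turns an undersized allocation into an exception with a call stack rather than silent memory corruption.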

chessai commented 5 years ago

ok, i'll retry when you attempt the fix.

andrewthad commented 5 years ago

Fixed with 82adfee0725715794701ee672d1708cc46e9ae95. I've confirmed that pinging multiple times from the repl no longer causes a segfault. Feel free to close this if you don't observe any other segfaults.

chessai commented 5 years ago

your push seems to have also changed the sockets and primitive-checked dependencies so that they point at projects on your local filesystem. could you fix that by pushing the relevant package changes to github?

chessai commented 5 years ago

the repl now works. i'll open a separate issue about the filesystem dependencies.