dark-enstein / scour

just not curl
MIT License

GET response truncation issues #3

Open noornee opened 9 months ago

noornee commented 9 months ago

The default size of your byte slice (1024) is not sufficient to hold the entire response.

e.g. when you do

make build
./build/scour example.com

(screenshot: truncated response from `./build/scour example.com` in Termux)

The response gets truncated because the buffer isn't large enough. You could bump it to 2048 and this case would work, but that isn't a real fix: someone will eventually scour a site bigger than any fixed size.

You can address this by reading the body with io.ReadAll, which resizes its buffer dynamically:

    func ReadAll(r Reader) ([]byte, error)

ReadAll reads from r until an error or EOF and returns the data it read. A successful call returns err == nil, not err == EOF. Because ReadAll is defined to read from src until EOF, it does not treat an EOF from Read as an error to be reported.

    //res := make([]byte, 1024)
    //i, err := resp.Body.Read(res)
    body, err := io.ReadAll(resp.Body)
    ...
noornee commented 9 months ago

And in get.go, line 30:

you could actually achieve the same result without using the anonymous function by doing

defer resp.Body.Close()
dark-enstein commented 9 months ago

Thanks for flagging this @noornee. Funny enough, I implemented a fix for this issue in invoke/socket.go here. It implements a way to grow the buffer dynamically.

    var buf bytes.Buffer
    tmp := make([]byte, RCV_PAGESIZE)
    for {
        n, err := c.conn.Read(tmp)
        // Read may return n > 0 together with io.EOF, so keep this
        // chunk before inspecting the error.
        buf.Write(tmp[:n])
        if err != nil {
            if err != io.EOF {
                fmt.Println("Error receiving response:", err.Error())
                return buf.Bytes(), err
            }
            break
        }
    }
    return buf.Bytes(), nil

Note: I am trying to keep per-iteration read memory low. I will also increase the read buffer size to 2048 to allow more bytes per read.

What do you think?

noornee commented 9 months ago

Ahhh, I see. The code looks great! You're welcome, and happy coding to ya ^^

dark-enstein commented 9 months ago

Since the fix for this is fairly straightforward, I should have it out soon.

dark-enstein commented 9 months ago

Hey @noornee, new PR here. Any thoughts?

Sidenote: I really need to set up some CI. Will create an issue for that.

dark-enstein commented 9 months ago

Merged in

noornee commented 9 months ago

Hiii @dark-enstein

so I switched to the sco-1 branch and tried it out. It works! But there's a catch: for some reason it didn't work for a site like example.com. Turns out that's because READ_PAGESIZE was set to 2048. I changed READ_PAGESIZE to 1024 and it worked, for reasons I don't really understand.

I asked chatgpt about it.

I really think we should just use io.ReadAll to get the body bytes. It makes the code less complex.

so instead of

    var buf bytes.Buffer
    tmp := make([]byte, READ_PAGESIZE)
    for {
        n, err := resp.Body.Read(tmp)
        buf.Write(tmp[:n]) // keep the chunk even when err is io.EOF
        if err != nil {
            if err != io.EOF {
                fmt.Println("Error receiving response:", err.Error())
            }
            break
        }
    }

we do

    buf, err := io.ReadAll(resp.Body)
    ...

also line 31 needs to be changed from

    defer func(Body io.ReadCloser) {
        _ = Body.Close()
    }(resp.Body)

to

defer resp.Body.Close()
dark-enstein commented 9 months ago

Yes, that makes sense. io.ReadAll already handles this partial-read buffering internally, starting from a 512-byte buffer and growing it as needed. Given that it essentially does what our loop does and is well tested, I will switch to io.ReadAll. Thanks for advocating @noornee.

I'll also revamp the defer syntax.

noornee commented 9 months ago

Alrighty then!

dark-enstein commented 8 months ago

Hey @noornee, what do you think? https://github.com/dark-enstein/scour/pull/5

noornee commented 8 months ago

Hii @dark-enstein, it looks good to me. well done!