xssnick / tonutils-storage

TON Storage golang implementation
Apache License 2.0

file descriptors cache needs to be disableable / configurable #31

Open kolayuk opened 6 days ago

kolayuk commented 6 days ago

Because right now it can cause issues when fetching pieces from other nodes, like:

2024/09/11 13:51:50 [STORAGE] LOAD PIECE FROM 192.168.1.36:10501 ERR: failed to query piece 4. err: response deadline exceeded, err: context deadline exceeded

because the node waits indefinitely. Also, the limit is too low for a library: servers usually have an unlimited (or very high) file descriptor setting.

But for now it is impossible to manage: fs is private (it is used in torrent.getInternalPieces) and the limit is a const.

Steps to reproduce: 1) Create multiple bags in a single process, each with a ~10 MB file:

tr, err := storage.CreateTorrent(ctx, c.rootStoragePath, user, user, c.progress, c.conn, refs, func(done uint64, max uint64) {
    if done == max {
        wg.Done()
    }
})
if err != nil {
    // handle the error
}
err = tr.Start(true, true, false)

2) Request those files through Ton-Proxy multiple times; after 3-4 tries the download gets stuck:

curl ... -x https://127.0.0.1:8081 http://<BAG_ID>.bag/output.mp4 -o output_d.mp4

> GET http://<...>.bag/output.mp4 HTTP/2
> Host: ....bag
> User-Agent: curl/8.4.0
> Accept: */*
> Proxy-Connection: Keep-Alive
> 
{ [5 bytes data]
  0     0    0     0    0     0      0      0 --:--:--  0:09:42 --:--:--     0^C

^ almost 10 minutes stuck trying to download

After downgrading to 0.6.3 the issue is no longer reproducible.

xssnick commented 6 days ago

Hi, thank you for the report!

@Gealber could you please check the fs cache for this kind of issue?

Gealber commented 4 days ago

@xssnick I'll take a look at it

Gealber commented 3 days ago

@kolayuk could you provide a fully reproducible scenario? The code snippet you provided is quite incomplete; even the simplest way to reproduce it would be quite helpful.

EDIT:

I think it is not needed after all: as soon as the limit _FDLimit = 800 is reached, we will have this issue. The solution I have in mind is basically what @kolayuk asked for: make the fs public and configurable from the CLI. The only way I see to do that right now is to add it as a field in the Storage struct, roughly as in the sketch below.
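Roughly what I have in mind (a sketch only; the field name and zero-value semantics are my assumptions, not the current API):

package storage

// Sketch: expose the FD cache limit instead of the hardcoded _FDLimit = 800.
type Storage struct {
    // ...existing fields...

    // FDLimit caps simultaneously open piece file descriptors.
    // Zero keeps the current default.
    FDLimit int
}

func (s *Storage) fdLimit() int {
    if s.FDLimit > 0 {
        return s.FDLimit
    }
    return 800 // current hardcoded default
}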

@xssnick can you confirm whether this makes sense to you?

kolayuk commented 1 day ago

Yes, if it were possible to override fs with no limit / cache (a custom implementation left up to the executable), that would be a solution, e.g. something like the sketch below.
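Purely illustrative; the interface name and method shapes are my guesses based on the Acquire/Free calls discussed here:

package storage

import "os"

// FDProvider is a hypothetical pluggable replacement for the private fs.
type FDProvider interface {
    Acquire(path string) (*os.File, error)
    Free(f *os.File)
}

// noLimitFS is a trivial implementation with no cache and no cap.
type noLimitFS struct{}

func (noLimitFS) Acquire(path string) (*os.File, error) { return os.Open(path) }
func (noLimitFS) Free(f *os.File)                       { _ = f.Close() }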

xssnick commented 1 day ago

It is strange that it gets stuck and stops downloading after some attempts. Does it block forever and never unlock?

kolayuk commented 1 day ago

It will probably unlock once the FD count drops below 800, but that can take a long time if you keep serving all your uploads (otherwise you have to manually stop some of them). My point is that this limitation makes some sense for consumer machines (Linux usually defaults to a 1024 limit), but a strict hardcoded limit makes no sense in the lib (as opposed to the CLI), since the lib can be used in server-side software, and servers usually have a 1M file descriptor limit or

$ ulimit -n
unlimited

depending on the kernel.
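For illustration, an executable can inspect its own limit at runtime (Unix-only sketch; nothing in the lib does this today):

package main

import (
    "fmt"
    "syscall"
)

func main() {
    var rl syscall.Rlimit
    // RLIMIT_NOFILE is the per-process open file descriptor limit.
    if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
        panic(err)
    }
    fmt.Printf("soft=%d hard=%d\n", rl.Cur, rl.Max)
}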

Gealber commented 1 day ago

Yes, @kolayuk, you are right: the reason it gets locked is that the 800 limit is reached, and as long as fs.Free() is not called the FD counter won't decrease, hence it never unlocks. About increasing the limit, I also agree it should at least be configurable.

Gealber commented 1 day ago

> It is strange that it gets stuck and stops downloading after some attempts. Does it block forever and never unlock?

The reason is that the limit is reached, and the call to Wait() in the Acquire method blocks until the current FD counter decreases; at least that is my understanding. Maybe we should think about how to make this limit configurable.
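To illustrate the blocking behavior I mean, a minimal sketch of a counting limiter with that Acquire/Free shape (not the actual tonutils-storage code):

package storage

import "sync"

// fdLimiter sketches the kind of counter that blocks at a fixed limit.
type fdLimiter struct {
    mu    sync.Mutex
    cond  *sync.Cond
    open  int
    limit int
}

func newFDLimiter(limit int) *fdLimiter {
    l := &fdLimiter{limit: limit}
    l.cond = sync.NewCond(&l.mu)
    return l
}

// Acquire blocks while the number of open descriptors is at the limit.
func (l *fdLimiter) Acquire() {
    l.mu.Lock()
    defer l.mu.Unlock()
    for l.open >= l.limit {
        l.cond.Wait() // this is where a node hangs if Free is never called
    }
    l.open++
}

// Free releases one slot and wakes one waiter.
func (l *fdLimiter) Free() {
    l.mu.Lock()
    defer l.mu.Unlock()
    l.open--
    l.cond.Signal()
}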

xssnick commented 1 day ago

As an option, we can make it configurable via a command-line argument, with the default set to unlimited (or something really big); then anyone who wants to tune it can use this argument to set a specific value.
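Something like this (hypothetical wiring; the flag name and any public setter are assumptions, they do not exist yet):

package main

import (
    "flag"
    "fmt"
    "math"
)

func main() {
    // Hypothetical --fd-limit flag; default is effectively unlimited.
    fdLimit := flag.Int("fd-limit", math.MaxInt32, "max simultaneously open piece file descriptors")
    flag.Parse()

    // storage.SetFDLimit(*fdLimit) // assumed setter, not a real API yet
    fmt.Println("would configure FD limit:", *fdLimit)
}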