mrd0ll4r / chihaya-privtrak-middleware

A private tracker middleware for the chihaya BitTorrent tracker
MIT License

Include seeding time in StatDeltas #1

Open ipsrt opened 5 years ago

ipsrt commented 5 years ago

Many private trackers track not only the upload and download deltas in each announce, but also the incremental seeding time. This is used to enforce site rules (e.g. torrents must be seeded for a minimum of 7 days) or as part of a site currency (e.g. every hour each user gets points based on how many torrents they're seeding and for how long; the points can then be used to improve the user's seeding ratio or to buy features on the site).

Since userSwarmStats structs already track the time of the user's last announce, it makes sense to me to integrate this delta seeding time into this middleware.

Let me know your thoughts on how & where to include a seeding time feature into the middleware.

mrd0ll4r commented 5 years ago

Oh nice, someone is interested in this!

I read through your PR and thought about this for a bit. The main issue I'm seeing is that, if we now generate a delta for every announce (instead of only for announces with actual deltas in uploaded/downloaded), we would potentially generate a massive amount of deltas. To me, this kinda defeats the purpose of recording deltas in the first place - we could just report out all announces straight to some event processing system and throw queries at that.

Also, if we emit a delta for each announce anyway, the consumer can calculate the seed time on their end, based on the reported timestamps, or something like that...

We could, for example, instead add a dirty bit to the userSwarmStats, and also a seedTimeSinceLastDelta or something like that. Then, when we receive a Stopped event or clean up the userSwarmStats due to GC, we would check the dirty bit and, only once, emit a delta with DeltaUp==0 and DeltaDown==0. We should probably still rely on updating the seedTimeSinceLastDelta on each announce, and shouldn't report (current time - last received announce) or something like that, in case peers leave silently, or die.

Alternatively, we could emit a second type of event, namely something that keeps track of seed time.

What do you think?

PS: Also, I checked back into IRC today, I'm always idling there and read your messages, I'll see if I catch you at some point, otherwise this works fine too. About your question on IRC: Nope, I don't know of anyone who is using this. I just thought it up as a sane way to potentially handle the whole private tracker situation. It's not super nice (probably eats a lot of memory), but again, never tried anywhere.... who knows.

ipsrt commented 5 years ago

My first thought was also to just take the StatDeltas as they are and work out the seeding time backwards from those. The issue I see with this is that on some trackers it's common for seeds to have zero seeding activity for weeks on end, i.e. the tracker would emit no stat deltas over that period. So if the tracker were to crash or something like that, all the seeding time progress for that peer would be lost, back to the most recent stat delta that was processed for them.

I get your concerns about emitting a huge amount of deltas. To me it makes sense not to have the seeding time functionality in a separate middleware, since all the sharding and infohash sorting plumbing is already set up (and the memory allocated) for the StatDelta middleware.

Emitting a second event just for seed time would make sense to me as well. Or maybe a config option that sets whether or not seeding time is tracked at all? I'm sure you'd know better which approach would be more sane from an organisation and performance standpoint.

mrd0ll4r commented 5 years ago

Alright, here's what I thought up:

Let me know if you think I forgot something.

Why I think this should work well:

Grand total: Some additional events, of which the rate and granularity can be configured; some additional storage; some additional complexity.

Optional: We might make it so that people can disable this functionality. We cannot save the space in the struct, but we can skip the added complexity and the periodic flushing.