PinataCloud / apollo


IPFS Enhanced Bandwidth Tracker #2

Open kyletut opened 4 years ago

kyletut commented 4 years ago

Prize Title

IPFS Bandwidth Tracker

Prize Bounty

2000 DAI

Challenge Description

Currently IPFS can track cumulative bandwidth with the ipfs stats bw command.
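For reference, the same cumulative totals are exposed over the node's HTTP RPC API at /api/v0/stats/bw. Below is a small Go sketch of how an outside process might read them; the default local API address 127.0.0.1:5001 is an assumption and should be adjusted to your node's configuration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// BandwidthStats mirrors the JSON returned by the node's
// /api/v0/stats/bw endpoint (the same data "ipfs stats bw" prints).
type BandwidthStats struct {
	TotalIn  int64
	TotalOut int64
	RateIn   float64
	RateOut  float64
}

func main() {
	// Assumes a local node listening on the default API address.
	resp, err := http.Post("http://127.0.0.1:5001/api/v0/stats/bw", "", nil)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var stats BandwidthStats
	if err := json.NewDecoder(resp.Body).Decode(&stats); err != nil {
		panic(err)
	}
	fmt.Printf("total in: %d bytes, total out: %d bytes\n", stats.TotalIn, stats.TotalOut)
}
```

Note that this only gives node-wide totals, which is exactly the limitation this bounty is about.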

Pinata would like a way to track bandwidth utilized on a per-CID basis. The most likely solution for this would exist as part of IPFS's plugin system: https://github.com/ipfs/go-ipfs/blob/master/docs/plugins.md

Given the heavy logging requirements of per-CID tracking, the best architecture we currently have for this is a system where every time a block is sent via Bitswap, the count for that block's CID is increased by one. These counts can then be consumed in a queue-like fashion: a function can be called that returns the counts of all CIDs whose counts have increased since the last ask. When a CID's count is returned this way, it is reset to 0, which makes it possible to track only "new" bandwidth usage.

The purpose of this setup is to let counts be processed asynchronously: an outside process continually queries the node for new bandwidth usage and then handles the increased CID counts however it wishes.
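A minimal sketch of such a counter is shown below. It is illustrative only: CIDs are represented as plain strings (a real plugin would use the go-cid type), and the point where Bitswap sends a block is assumed rather than wired into go-bitswap.

```go
package main

import (
	"fmt"
	"sync"
)

// CIDCounter sketches the counter described above: every time a block
// is sent via Bitswap, Incr is called with that block's CID; Drain
// returns all CIDs whose counts changed since the last call and resets
// them to zero, so callers only ever see "new" bandwidth usage.
type CIDCounter struct {
	mu     sync.Mutex
	counts map[string]uint64
}

func NewCIDCounter() *CIDCounter {
	return &CIDCounter{counts: make(map[string]uint64)}
}

// Incr records one more block sent for the given CID.
func (c *CIDCounter) Incr(cid string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[cid]++
}

// Drain returns the counts accumulated since the previous Drain call
// and resets the internal state.
func (c *CIDCounter) Drain() map[string]uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	out := c.counts
	c.counts = make(map[string]uint64)
	return out
}

func main() {
	ctr := NewCIDCounter()
	ctr.Incr("QmExampleCID1")
	ctr.Incr("QmExampleCID1")
	ctr.Incr("QmExampleCID2")

	// An outside process would poll like this and ship the deltas
	// wherever it wants (database, metrics pipeline, etc.).
	for cid, n := range ctr.Drain() {
		fmt.Printf("%s: %d blocks sent since last check\n", cid, n)
	}
}
```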

Submission Requirements

Judging Criteria

The IPFS Enhanced Bandwidth Tracker will be judged on its ability to meet the above criteria. Bonus points will be given for documentation / ease of deployment.

Winner Announcement Date

All submissions must be received no later than 11:59 PM EDT on Oct 8th, 2020, to be considered.

gitcoinbot commented 4 years ago

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


This issue now has a funding of 2000.0 DAI (2000.0 USD @ $1.0/DAI) attached to it.

nitram509 commented 4 years ago

Just out of curiosity, wouldn't it be an option to extend "go-ds-s3" so that it counts the bytes transferred over time? As a result, one would know the bandwidth of the connection used by the IPFS client, no?
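For illustration, a byte-counting wrapper along those lines could look roughly like the sketch below. It is deliberately self-contained: the Store interface here is a simplified stand-in for the real go-datastore interface that go-ds-s3 implements (which also has Has, GetSize, Delete, Query, etc.), and locking is omitted for brevity.

```go
package main

import "fmt"

// Store is a simplified stand-in for the go-datastore interface.
type Store interface {
	Get(key string) ([]byte, error)
	Put(key string, value []byte) error
}

// countingStore wraps another Store and keeps running totals of the
// bytes read from and written to it, which is roughly the extension
// described above. Note it yields node-wide totals, not per-CID counts.
type countingStore struct {
	inner             Store
	bytesIn, bytesOut int64
}

func (c *countingStore) Get(key string) ([]byte, error) {
	v, err := c.inner.Get(key)
	if err == nil {
		c.bytesOut += int64(len(v)) // bytes served from the datastore
	}
	return v, err
}

func (c *countingStore) Put(key string, value []byte) error {
	err := c.inner.Put(key, value)
	if err == nil {
		c.bytesIn += int64(len(value)) // bytes stored into the datastore
	}
	return err
}

// memStore is a trivial in-memory Store used to exercise the wrapper.
type memStore map[string][]byte

func (m memStore) Get(key string) ([]byte, error) {
	v, ok := m[key]
	if !ok {
		return nil, fmt.Errorf("not found: %s", key)
	}
	return v, nil
}

func (m memStore) Put(key string, value []byte) error {
	m[key] = value
	return nil
}

func main() {
	cs := &countingStore{inner: memStore{}}
	_ = cs.Put("block1", []byte("hello"))
	_, _ = cs.Get("block1")
	fmt.Printf("in=%d bytes, out=%d bytes\n", cs.bytesIn, cs.bytesOut)
}
```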

obo20 commented 4 years ago

@nitram509 After further discussion amongst our team, we've modified the bounty in a way that hopefully makes it more generic and applicable to any IPFS node (not just S3-backed IPFS nodes). Please see the modified bounty.

nitram509 commented 4 years ago

Thanks for the info. In general, parsing logs on the server side has some pitfalls; e.g. according to the documentation, one only gets the request time from the logs. This means that to derive the bandwidth, uploads must be strictly sequential; in the case of parallel uploads, one can't derive the upload time.

Also, I read this: "Server access log records are delivered on a best effort basis. Most requests for a bucket that is properly configured for logging result in a delivered log record. Most log records are delivered within a few hours of the time that they are recorded, but they can be delivered more frequently." This means one can't track bandwidth in real time, but in the worst case only a few hours later.

Besides, if you believe that's possible, may I ask what problem you're trying to solve? I still think it would be an option to extend the "go-ds-s3" library so that it counts the bytes transferred over time. As a result, one would know the bandwidth of the connection used by the IPFS client, no?

obo20 commented 4 years ago

> Thanks for the info. In general, parsing logs on the server side has some pitfalls; e.g. according to the documentation, one only gets the request time from the logs. This means that to derive the bandwidth, uploads must be strictly sequential; in the case of parallel uploads, one can't derive the upload time.

We don't actually need the upload times, just counts. We're not interested in when things are sent to the network, but rather in "here's how many times each CID (block) was delivered to the network since we last checked."

> Also, I read this: "Server access log records are delivered on a best effort basis. Most requests for a bucket that is properly configured for logging result in a delivered log record. Most log records are delivered within a few hours of the time that they are recorded, but they can be delivered more frequently." This means one can't track bandwidth in real time, but in the worst case only a few hours later.

While this isn't necessarily an issue for us, we did choose to change this bounty to suggest it be built using the IPFS plugin system instead of AWS, as the end solution would then apply to all IPFS nodes instead of just ones using an S3-backed store.

> Besides, if you believe that's possible, may I ask what problem you're trying to solve? I still think it would be an option to extend the "go-ds-s3" library so that it counts the bytes transferred over time. As a result, one would know the bandwidth of the connection used by the IPFS client, no?

We're looking to understand and track how much specific content is being delivered by our nodes around the world in order to better optimize our network of IPFS nodes. We know how much "total bandwidth" a node uses, but IPFS doesn't provide a way to track "per CID bandwidth", which is what we want from this proposed bounty solution.

gitcoinbot commented 4 years ago

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


Work has been started.

These users each claimed they can complete the work by 1 week, 3 days ago. Please review their action plans below:

1) tzdybal has started work.

Add a generalized "hook" for (statistics) plugins in go-bitswap. Create an ipfs-plugin that uses data from go-bitswap to implement "per CID bandwidth tracking". (A rough sketch of such a hook follows below.)

Learn more on the Gitcoin Issue Details page.
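The "generalized hook" in tzdybal's plan does not exist in go-bitswap today; the sketch below is purely hypothetical and only illustrates how such a hook could feed a per-CID statistics plugin. Names like BlockSentListener and sendBlock are invented for the example, and locking is omitted for brevity.

```go
package main

import "fmt"

// BlockSentListener is the hypothetical hook interface a statistics
// plugin would implement; go-bitswap would call BlockSent every time
// a block leaves the node.
type BlockSentListener interface {
	BlockSent(cid string, sizeBytes int)
}

// cidStats is a trivial listener that counts blocks sent per CID,
// along the lines of the counter sketched earlier in this thread.
type cidStats struct {
	counts map[string]uint64
}

func (s *cidStats) BlockSent(cid string, sizeBytes int) {
	s.counts[cid]++
}

// sendBlock stands in for the point inside go-bitswap where a block is
// actually written to a peer; the hook is invoked right after the send.
func sendBlock(cid string, data []byte, listeners []BlockSentListener) {
	// ... real network send would happen here ...
	for _, l := range listeners {
		l.BlockSent(cid, len(data))
	}
}

func main() {
	stats := &cidStats{counts: make(map[string]uint64)}
	sendBlock("QmExampleCID", []byte("block bytes"), []BlockSentListener{stats})
	fmt.Println(stats.counts)
}
```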

gitcoinbot commented 4 years ago

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


Work for 2000.0 DAI (2000.00 USD @ $1.0/DAI) has been submitted by: