application-research / estuary

A custom IPFS/Filecoin node that makes it easy to pin IPFS content and make Filecoin deals.
https://docs.estuary.tech

SP Selection Improvement #545

Open jcace opened 1 year ago

jcace commented 1 year ago

Expanding on the SP side of @alvin-reyes's issue on shuttle selection improvement:

Problem

Miners currently have no way to (optionally) specify their location, which could give them a better chance of sealing deals for large content from a more favourable (nearest) shuttle.

This results in a situation where SPs can receive a lot of deals from an Estuary shuttle that they have a very slow or limited connection to. Those deals have no chance of completing, and the available bandwidth is consumed trying to stream many deals all at once.


Current geographic distribution of SPs (blue) and Estuary shuttles (red):

[map of shuttle and SP locations]

Performance issues arise when a shuttle streams data to an SP far away (in the example above, an SP node in Vancouver was dealt 138 deals from Shuttle-5, which is in Tokyo).

When a fast shuttle streams deals to a miner, it's a beautiful sight πŸ₯Ή. Downloads complete in mere minutes.


Potential Solution

Model Update: Shuttle Preference

The StorageMiner struct should be updated to contain an additional field ShuttlePreference. This would be a slice containing Shuttle identifiers, in order from most preferred to least preferred.
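
A minimal sketch of the shape this could take (the `ShuttleID` type and the surrounding field names are illustrative, not Estuary's actual model definition):

```go
package model

// ShuttleID identifies an Estuary shuttle (e.g. "shuttle-1.estuary.tech").
type ShuttleID string

// StorageMiner sketch: only ShuttlePreference is the proposed addition; the
// other fields are placeholders rather than Estuary's real model.
type StorageMiner struct {
	Address string

	// ShuttlePreference holds shuttle identifiers ordered from most
	// preferred (index 0) to least preferred.
	ShuttlePreference []ShuttleID
}
```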

Replication: Miner Selection

Currently, SP selection for dealmaking is essentially random. The Miner Selection code should be updated to consider shuttle preference; a rough sketch follows the list below.

For each piece of content to be replicated:

1. Determine where the content lives (what shuttle)
2. Order miners for the deal based on **ShuttlePreference**
    1. Consider all SPs that have the shuttle as *most preferred* first, and attempt to make deals with any eligible ones
    2. If the deal can't be made with any of those SPs, then consider all miners that have the shuttle as 2nd most preferred
    3. etc...
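
One way the ordering step could look, reusing the hypothetical `StorageMiner`/`ShuttleID` types sketched above (the function name and package are assumptions, not Estuary's actual dealmaking code):

```go
package replication

import (
	"math"
	"sort"
)

// Types as sketched earlier; repeated here so the example is self-contained.
type ShuttleID string

type StorageMiner struct {
	Address           string
	ShuttlePreference []ShuttleID
}

// selectMinersByPreference orders candidate miners by how highly they rank
// the shuttle that holds the content. Miners that do not rank the shuttle at
// all are tried last; dealmaking would then walk this list, attempting deals
// with eligible miners in order.
func selectMinersByPreference(source ShuttleID, candidates []StorageMiner) []StorageMiner {
	rank := func(m StorageMiner) int {
		for i, s := range m.ShuttlePreference {
			if s == source {
				return i
			}
		}
		return math.MaxInt // shuttle not ranked by this miner
	}
	ordered := append([]StorageMiner(nil), candidates...)
	sort.SliceStable(ordered, func(i, j int) bool {
		return rank(ordered[i]) < rank(ordered[j])
	})
	return ordered
}
```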

N.B. I think this concept of shuttle preference / location-based dealmaking should be separate from the reputation system that we build in the future. We can still apply reputation logic when determining which miners to use within each shuttle/location. Location isn't something the SP has control over. If a reputation system is going to be used to incentivize good behaviour, then it should only consider metrics that the SP can monitor and improve (uptime, time to seal, # of deals accepted, retrievability, etc).

Determining ShuttlePreference

Manual Approach

Initially, this can be accomplished with a simple UI for the miner to order shuttles on their Estuary dashboard. A reorderable list component like this:

[example of a reorderable list component]

The list would contain shuttle identifiers, IP addresses, and geographic regions. That way, miners could self-order by geographic proximity, but could also run ping tests if desired and order by network speed.

On the backend, a protected API endpoint could be added under /miners/shuttle-preference to allow the preference to be specified programmatically.
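
A sketch of what that endpoint could look like, assuming the labstack/echo framework Estuary's API is built on; the request body shape and the in-memory store are illustrative stand-ins for the real auth and persistence layers:

```go
package main

import (
	"net/http"

	"github.com/labstack/echo/v4"
)

// In-memory store for illustration only; Estuary would persist this on the
// StorageMiner record instead.
var shuttlePreferences = map[string][]string{}

func main() {
	e := echo.New()

	// PUT /miners/shuttle-preference — miner authentication middleware is
	// omitted here, but the real endpoint would sit behind it.
	e.PUT("/miners/shuttle-preference", func(c echo.Context) error {
		var body struct {
			Miner             string   `json:"miner"`
			ShuttlePreference []string `json:"shuttlePreference"`
		}
		if err := c.Bind(&body); err != nil {
			return echo.NewHTTPError(http.StatusBadRequest, "invalid request body")
		}
		shuttlePreferences[body.Miner] = body.ShuttlePreference
		return c.JSON(http.StatusOK, map[string]string{"status": "ok"})
	})

	e.Logger.Fatal(e.Start(":3004"))
}
```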

Automated Approach

A few ideas for automating ShuttlePreference:

  1. Collect stats for each SP based on historic deals - # accepted, time to transfer, dropped connections, etc. Use that information to automatically refine the shuttle preferences.
  2. Create a task that runs on each shuttle to ping all SPs, and order them by lowest ping. Note: this approach would create s(shuttles) Γ— m(miners) scalability issues, as each miner would need to be pinged from each shuttle.
  3. Create a task that geolocates each SP based on an ipinfo.io geo report, and figure out shuttle ordering based on geographic proximity. Note: geographic proximity may not necessarily correlate with performance, depending on internet connectivity/routing.
  4. Create a small open-source tool, estuary-shuttle-bench, and make it available for SPs to download and execute on their node. The tool would ping/run a speed test against each of the shuttles, order them, and call the /miners/shuttle-preference API to set the shuttle preference automatically.
neelvirdy commented 1 year ago

@jcace I'm personally the biggest fan of automated approach number 2. We can find ways to prevent the scalability issues, e.g. stochastic sampling of the SP pool plus periodic resampling. It doesn't sound too hard to write and only requires an Estuary code change, meaning brand-new Estuary nodes and SPs will get the benefit by default, and no one needs to track anything manually.

alvin-reyes commented 1 year ago

I certainly like the automated approach. I think we can apply the same logic for miner and shuttle selection.

We use the multiaddresses for miners, and use the Equinix API for the shuttles (I have this on my metrics API). We can get the long/lat of each address and use that as the basis.

I imagine we will have an input like this:

Output:

I imagine that if we have this, we can use it as a way to reliably look up miners for making either storage or retrieval deals.

Let me know thoughts.

10d9e commented 1 year ago

I like both approaches tbh; however, if the automated solution is smart enough, we may not need the first. The scoring could probably be some combination of IP geolocation and performance (ping/historical stats), with weightings as we see fit. We might also be able to make these configurable by the Estuary administrators. For example:

alvin-reyes commented 1 year ago

I think we can start working on a standalone library for this, sp-selection-service, and start by just building the scaffolding. I assume you would need a DAO to access the DB to get the miner list.

I was thinking we should have the core functions (and expose them via endpoints):

*Shuttles will have the same, and we can use it as an alternative to the "/viewer" endpoint.

The inputs can be configurable by the admin. We can add more input params to it, and we can "protect" this endpoint with Estuary-Auth.

Let me know your thoughts.

jcace commented 1 year ago

Do we actually care where the SPs are located?

For example, imagine:

In this scenario, do we want to give SP B additional weighting because of its geographical location, despite it probably having worse download performance? Are we trying to provide a method to localize the data within a certain region, to support client needs?

The reason for considering this is that map-based geolocation forces us to rely on an external service for the IP -> coordinates mapping (e.g. ipinfo.io, ipgeolocation.io, etc.). While free for < 150K requests per month, it requires a subscription if we start doing lots of queries, and will require us to provision an API key. I couldn't find any reasonable open-source methods to do this, so it introduces a bit of centralization.

Doing a strict ping/bandwidth-based selection simplifies this, as all we care about is bandwidth between the shuttle and the target SP (this metric also pertains to deal transfer / retrieval times). We forego the concept of geographical location, but I presume it will be captured by the bandwidth tests. It also simplifies things from the white-label / Estuary-fork perspective, since standing up a new instance of Estuary would not require an ipinfo API key to be provisioned.

Thoughts?

jcace commented 1 year ago

> I was thinking we should have the core functions (and expose them via endpoints):
>
>   • POST /miner/list - optional input: source IP address OR long/lat of the source. The logic would be to query the list of reputable miners (query 2 above). For each miner, we compare its location against the input source IP or long/lat.
>   • POST /miner/most-deals/ - return the result of query 2 above, just so the user can pick a miner from the most reputable.
>   • POST /miner/most-loyal/ - return the result of query 3 above.
>   • POST /miner/most-uptime - return the result with the most uptime (I'm not sure if we can get information like this on api.chain.love).
>   • POST /miner/most-reliable - optional input of IP / long-lat, then the top result of queries 2 and 3 plus uptime, with @jlogelin's percentage.

I like the microservices idea, but I wonder if it makes sense to have a single endpoint (maybe GraphQL?), giving a list of all providers with their associated stats. We could come up with some input parameters when making the request (ex, location), to help filter it down, and perhaps put in some pagination too.

Ex)

POST /storage-providers

Returns:

[
  {
    "id": "f01234",
    "addr": "/ip4/1.1.1.1/tcp/1111",
    "deals": {
      "open": 23,
      "sealed": 412,
      "slashed": 6
    },
    "location": {
      "lat": -41.123,
      "lon": 110.456,
      "city": "NY",
      "region": "NY",
      "country": "USA"
    },
    "uptime_score": 0.9923, // % uptime based on some polling metric
    "shuttle_connections": [
      {
        "name": "shuttle1",
        "addr": "/ip4/2.2.2.2/tcp/2222",
        "ping": 64.21,
      }
      ...
    ]
  }
  ...
]

The SP sorting/selection (most-reputable, most-uptime, etc.) is up to the caller. This service is just an unopinionated source of SP performance and geo information; the caller can choose how they want to slice it up and use it.
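
To pin down that schema, here is a hypothetical set of Go types mirroring the example response (names and field types are illustrative, not an agreed-upon API):

```go
package spindex

// StorageProvider mirrors one entry of the example response above.
type StorageProvider struct {
	ID       string   `json:"id"`
	Addr     string   `json:"addr"`
	Deals    Deals    `json:"deals"`
	Location Location `json:"location"`
	// UptimeScore is a 0..1 fraction based on some polling metric.
	UptimeScore        float64             `json:"uptime_score"`
	ShuttleConnections []ShuttleConnection `json:"shuttle_connections"`
}

type Deals struct {
	Open    int `json:"open"`
	Sealed  int `json:"sealed"`
	Slashed int `json:"slashed"`
}

type Location struct {
	Lat     float64 `json:"lat"`
	Lon     float64 `json:"lon"`
	City    string  `json:"city"`
	Region  string  `json:"region"`
	Country string  `json:"country"`
}

type ShuttleConnection struct {
	Name string `json:"name"`
	Addr string `json:"addr"`
	// Ping is the measured round-trip time in milliseconds.
	Ping float64 `json:"ping"`
}
```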

jcace commented 1 year ago

Had a call with @alvin-reyes @en0ma @jlogelin:

  1. We are going to draw the line and constrain our functionality to SP index/selection, not reputation. Those reputation services are already being developed elsewhere, and we can integrate with them in the future if necessary.
  2. Generally speaking, this new sp-selection will operate like: input: a location; output: a list of SPs, sorted/ordered in terms of closeness/reachability.
  3. Service vs library? Still undecided. We may start as a library in Estuary for simplicity and the ability to enrich with Estuary-specific data, keeping in mind that we may pull it out into a service later.

jcace commented 1 year ago

Did some research re: the geo-location aspect of this design.

I think the mapping of SP->Country could be useful as a filter parameter. There are some real use cases where data residency matters, and being in the same country likely means there's good network connectivity. However, I don't think it should be our only data point, as it can be wrong in some cases and irrelevant in others. I think the ping-based metrics for determining network speed are still needed.

What are everyone's thoughts on this approach to implementing the geolocation feature (country only)? It will require some work to port over the library and add IPv6 support.

@alvin-reyes @en0ma

jcace commented 1 year ago

Did a bunch of research and testing recently and formulated a revised problem statement and plan. Let me know what you think!

Problem

Currently, Estuary lacks a system to route data streams in an intelligent way that maximizes data transfer speeds.

This results in suboptimal, highly-variable upload speeds for clients adding content to Estuary, and during dealmaking when files are transferred to Storage Providers.

Solution

All file transfers (whether HTTPS upload or libp2p) use TCP as the underlying transport protocol. Because TCP limits how much unacknowledged data can be in flight at any one time, latency has a direct impact on transfer speeds.

For instance, data transfer over a connection with 30ms RTT may cap out at over 300Mbit/sec, whereas a connection with 150ms RTT would be scaled back to 140Mbit/sec. That is a more than 50% reduction in transfer speed, and it is entirely plausible given the global distribution of Estuary nodes and participants in our ecosystem.
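
A back-of-the-envelope way to see the relationship (the 4 MiB window below is an assumed value for illustration, not a measurement from Estuary infrastructure; real throughput also depends on window scaling, congestion control, and loss):

```go
package main

import "fmt"

// With a fixed amount of data allowed in flight (the effective TCP window),
// throughput is roughly bounded by window / RTT, so it falls as RTT grows.
func main() {
	const windowBits = 4 * 1024 * 1024 * 8 // assume a 4 MiB effective window
	for _, rttSec := range []float64{0.030, 0.150} {
		mbit := float64(windowBits) / rttSec / 1e6
		fmt.Printf("RTT %3.0f ms -> ~%4.0f Mbit/s ceiling\n", rttSec*1000, mbit)
	}
}
```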

Thus, to solve this problem, we need to measure the latency (ping) between nodes in the Estuary platform, and choose the lowest latency destination.

For content uploads, this means Clients should ping all of our Shuttles, and choose the one with the lowest latency to upload to.

For dealmaking, this means our Storage Providers should also be pinged, and selected based on their latency to the shuttles.

How?

Storage Providers

Storage Providers will be provided with a simple, open-source Go program or shell script to be executed on their node. This program pings all of our shuttles, orders them based on lowest latency, and makes a call to a /sp-preference API with the shuttle precedence order.

Example Script Output from running on my SP node f01886797

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ shuttle-4.estuary.tech β”‚ 84.00  β”‚
β”‚ shuttle-5.estuary.tech β”‚ 86.00  β”‚
β”‚ shuttle-6.estuary.tech β”‚ 114.80 β”‚
β”‚ shuttle-7.estuary.tech β”‚ 104.33 β”‚
β”‚ shuttle-8.estuary.tech β”‚ 111.00 β”‚
β”‚ shuttle-1.estuary.tech β”‚ 28.00  β”‚
β”‚ shuttle-2.estuary.tech β”‚ 85.00  β”‚
β”‚ shuttle-3.estuary.tech β”‚ 89.00  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”˜
Shuttle precedence order: 
1 | shuttle-1.estuary.tech 
2 | shuttle-4.estuary.tech 
3 | shuttle-2.estuary.tech 
4 | shuttle-5.estuary.tech 
5 | shuttle-3.estuary.tech 
6 | shuttle-7.estuary.tech 
7 | shuttle-8.estuary.tech 
8 | shuttle-6.estuary.tech 
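
A rough sketch of what such a tool could look like. Assumptions: TCP connect time to port 443 is used as a latency proxy instead of ICMP so the sketch needs no extra privileges or dependencies, and the final POST to the preference API is left out:

```go
package main

import (
	"fmt"
	"net"
	"sort"
	"time"
)

func main() {
	shuttles := []string{
		"shuttle-1.estuary.tech", "shuttle-2.estuary.tech",
		"shuttle-3.estuary.tech", "shuttle-4.estuary.tech",
		"shuttle-5.estuary.tech", "shuttle-6.estuary.tech",
		"shuttle-7.estuary.tech", "shuttle-8.estuary.tech",
	}

	type result struct {
		host string
		rtt  time.Duration
	}
	var results []result
	for _, h := range shuttles {
		start := time.Now()
		conn, err := net.DialTimeout("tcp", h+":443", 5*time.Second)
		if err != nil {
			continue // unreachable shuttles are simply skipped
		}
		conn.Close()
		results = append(results, result{h, time.Since(start)})
	}

	// Lowest latency first: this is the shuttle precedence order.
	sort.Slice(results, func(i, j int) bool { return results[i].rtt < results[j].rtt })

	fmt.Println("Shuttle precedence order:")
	for i, r := range results {
		fmt.Printf("%d | %s (%.2f ms)\n", i+1, r.host, float64(r.rtt.Microseconds())/1000)
	}
	// A real tool would then POST this ordering to the proposed
	// preference endpoint (omitted here).
}
```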

On the Estuary side, we'll "pivot" the data, assigning each SP node to a priority bucket/order for each shuttle.

The data will look something like this:

| source | priority 0 SPs | priority 1 SPs | priority 2 SPs |
| --- | --- | --- | --- |
| shuttle-1 | f01234, f4567, f8901 | f9999, f1212 | f8821 |
| shuttle-2 | f9999, f1212 | f8821 | f01234, f4567, f8901 |

When making deals, the priority 0 / best SPs can be attempted first, before moving to the next priority bucket and so on.
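
A small sketch of that pivot step (hypothetical names and types; the real implementation would read from the SP-preference table):

```go
package replication

// pivotPreferences turns per-SP shuttle orderings (minerID -> shuttles, best
// first) into per-shuttle priority buckets: bucket i for a shuttle holds the
// miners that ranked that shuttle at priority i, matching the table above.
func pivotPreferences(prefs map[string][]string) map[string][][]string {
	buckets := map[string][][]string{}
	for miner, shuttles := range prefs {
		for priority, shuttle := range shuttles {
			// Grow this shuttle's bucket list until the priority slot exists.
			for len(buckets[shuttle]) <= priority {
				buckets[shuttle] = append(buckets[shuttle], nil)
			}
			buckets[shuttle][priority] = append(buckets[shuttle][priority], miner)
		}
	}
	return buckets
}
```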

Note 1

Only the shuttle precedence order will be captured. Ultimately, we only need to know the ordering and not the exact latency value. As the ping is initiated client-side, we don't want any incentive for SPs to "forge requests", attempting to game the system by setting all latencies to 0ms, for instance.

Note 2

Why Client-side? Why not initiate the ping request server-side?

In short, because it's less complicated. If we initiate pings from our side, then the burden is on SPs to ensure their networks are configured properly to respond to ICMP requests on their public-facing router/firewall. They may not want, or be permitted, to enable this functionality for security reasons. It would result in more administrative overhead for the Estuary team as we support/troubleshoot SPs that do not show up in our system.

Initiating the connection from the SP side will work every time, as we control the server side of the equation. The SP experience could be as simple as asking them to run a single command, e.g.:

wget https://github.com/application-research/sp-benchmark.sh -O - | sh

This is a fairly normal thing to ask Storage Providers to do. We already download and run plenty of other services to interface with different marketplaces, such as bidbot and FilSwan Provider, and Evergreen/Spade requires custom scripting to pull deals from its API.

Clients

The client side of the equation is quite straightforward. We'll use the same principle to benchmark connectivity between the client and the shuttles, and pick the one with the lowest latency to direct the file upload towards. This can be done in the browser, in the background, without any input from the user. I've verified that the principle works with a simple HTML page.

Browser-based latency benchmarking is possible with a simple WebSocket. ICMP ping will not work, as it can't be initiated from browser JavaScript code.

This shuttle-ordering information could be added to the user's account metadata, so it can be automatically taken into account when direct API calls are made and the UI is bypassed.

Low-level task breakdown

  1. Estuary Handlers
    1. Shuttle list needs to be exposed to all Estuary users via API (i.e, make /admin/shuttle/list more openly accessible).
    2. Content /content/add -> add a param / make it aware of destination shuttle selection (not sure if it already supports this)
  2. Estuary sp-shuttle-preference service
    1. New table -> SP-Preference
    2. POST / endpoint -> add or update an SP's shuttle preferences
    3. GET / endpoint -> return back an SP's shuttle preferences
    4. GET /LOOKUP endpoint -> based on the "source shuttle", return a slice of SPs bucketed/ordered in terms of their priority
  3. Estuary ensureStorage / Dealmaking
    1. Modify to query the sp-shuttle-preference service, and use the ordered SP list to intelligently make deals that prioritize the SPs with highest preference for the shuttle making the deal.
  4. Build out "Satellite" project (tooling for the latency benchmarking)
    1. Client lib (js/react)
    2. SP tool - simple script/program to do the shuttle latency benchmarking
    3. Server-side (potential docker container) to run Shuttle end of the websocket
  5. Estuary-www
    1. SP dashboard -> show SP priority list, tooling download link, notification if unset
    2. User upload dashboard -> integrate client-side websocket approach to find best shuttle for uploads, attach to requests

Why not geolocation?

During initial investigation into this problem, we considered using traditional geolocation data points (country/city/region, latitude and longitude) to make the routing decisions. This approach would rely on an IP-address-to-geolocation service.

This approach is suboptimal for a few reasons:

  1. Often, the geolocation data is missing details or is entirely incorrect.
    • During testing with ipfs-geoip, my SP node's IP address mapped to Phoenix, Arizona, USA, despite being physically located in Vancouver, BC, Canada. That's an error of about 1500 miles, and not even in the same country.
    • Attempting to geolocate some other SPs in our network did not even yield a lat/lon/region, only a country (e.g. USA). We currently have three shuttles in the USA, so this does not provide enough resolution for making these routing decisions.
  2. More accurate geolocation services are available, but they are centralized, closed-source, proprietary, and charge license fees.
    1. While the most accurate service ipinfo.io did manage to locate my SP node correctly in Vancouver, utilizing this service would introduce a degree of centralization into our services and cost us money.
    2. Even then, we have no guarantee that these premium services are accurate.
  3. Closer geographic proximity does not guarantee lower latency.
    1. The internet is a complex web of peering and transit relationships between ISPs. The path that packets take through the internet does not always correlate with the physical distance between source and destination. You may have a better connection to a server that is 1000 miles away, as compared to one that is in a building right next to you.

Geographic location is useful information, as there do exist data residency requirements in certain industries and countries. This could be a valuable use case to support, but it is a different problem from the one we're trying to solve here.

The Filecoin Spade (Slingshot/Evergreen) program is an example of one that does use SP geographic information to ensure geographic distribution of files stored. However, signing up to participate in the program requires official documentation (datacenter lease, Internet contract) to make this attestation. That introduces a considerable amount of administrative overhead.

Opinion/Conclusion - as I see it, Estuary is currently an international, borderless ecosystem. We don't need to know or control which geographical region data is going to, and doing so doesn't provide the optimal SP/client experience. Our only driver right now is to maximize the speed at which clients can upload data, and the speed at which that data makes it onto the Storage Providers. That's best accomplished by optimizing for network latency above all else.

10d9e commented 1 year ago

@jcace - it looks like we can interrogate the SPs' latency through our libp2p node (Estuary main and/or shuttles). See https://filecoinproject.slack.com/archives/C016APFREQK/p1670270536095929?thread_ts=1670270103.300649&cid=C016APFREQK

jcace commented 1 year ago

Just tested the libp2p ping functionality and it seems to work great!

This allows us to ping SPs from our side, and it does not require any additional configuration on their end as long as they have Lotus set up properly.
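
For reference, a minimal sketch of that libp2p ping from an Estuary-side host, using go-libp2p's ping protocol (the SP multiaddr is passed on the command line here; in practice the address and peer ID would come from the SP's on-chain miner info, and several samples would be averaged):

```go
package main

import (
	"context"
	"fmt"
	"os"
	"time"

	"github.com/libp2p/go-libp2p"
	"github.com/libp2p/go-libp2p/core/peer"
	"github.com/libp2p/go-libp2p/p2p/protocol/ping"
	ma "github.com/multiformats/go-multiaddr"
)

func main() {
	// Usage: go run . /ip4/<sp-ip>/tcp/<port>/p2p/<peer-id>
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	h, err := libp2p.New()
	if err != nil {
		panic(err)
	}
	defer h.Close()

	addr, err := ma.NewMultiaddr(os.Args[1])
	if err != nil {
		panic(err)
	}
	info, err := peer.AddrInfoFromP2pAddr(addr)
	if err != nil {
		panic(err)
	}
	if err := h.Connect(ctx, *info); err != nil {
		panic(err)
	}

	// Take the first ping result; real selection logic would average a few.
	res := <-ping.Ping(ctx, h, info.ID)
	if res.Error != nil {
		panic(res.Error)
	}
	fmt.Printf("RTT to %s: %s\n", info.ID, res.RTT)
}
```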

Will rework the game plan when I create tasks to use this approach instead.