Open martinheidegger opened 6 years ago
We accomplished something like this with https://github.com/karissa/hyperhealth. Everything you need is in hypercore already.
Found the mafintosh UI for this: https://github.com/mafintosh/hypercore-stats-ui
and the @joehand one: https://github.com/joehand/dat-stats
@karissa Thank you for your input. I have read through most of it, but to my knowledge none of those actually address the issue I tried to fix:
> something like this with https://github.com/karissa/hyperhealth
It does accumulate the health of the dat, but only for the connected nodes, not for disconnected, former nodes. I.e. you don't get to see the health of a dat once the connection is lost.
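For illustration, a minimal sketch of what I mean by keeping health across disconnects. It assumes (as hyperhealth appears to do) that `feed.peers` exposes each connected peer's bitfield via `remoteBitfield`, and it simply polls instead of relying on any particular event; `remoteId` as a stable identifier and keeping snapshots in memory are both simplifications:

```js
// Sketch only, not production code. Assumptions: `feed.peers` lists the
// connected peers and each peer exposes `remoteBitfield.get(i)`.
// Persisting `snapshots` to disk is what would make health survive a restart.
const hypercore = require('hypercore')

const feed = hypercore('./my-dataset')
const snapshots = new Map() // hypothetical peer id -> boolean[] of held blocks

setInterval(function snapshotPeers () {
  for (const peer of feed.peers) {
    // Using `remoteId` as a stable identifier is an assumption.
    const id = peer.remoteId ? peer.remoteId.toString('hex') : String(feed.peers.indexOf(peer))
    const blocks = []
    for (let i = 0; i < feed.length; i++) {
      blocks.push(Boolean(peer.remoteBitfield && peer.remoteBitfield.get(i)))
    }
    snapshots.set(id, blocks)
  }
}, 5000)

// Health that still counts peers we are no longer connected to: a block
// counts as replicated if any snapshot (connected or not) ever claimed it.
function offlineAwareHealth () {
  if (!feed.length) return 0
  let replicated = 0
  for (let i = 0; i < feed.length; i++) {
    for (const blocks of snapshots.values()) {
      if (blocks[i]) { replicated++; break }
    }
  }
  return replicated / feed.length
}
```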
> the @joehand one: https://github.com/joehand/dat-stats
Doesn't that just expose your other package that I didn't know about? https://github.com/karissa/hypercore-stats-server
It looks like both of those are also using quite an old version of Dat.
> ui for this: https://github.com/mafintosh/hypercore-stats-ui
This is a beautiful UI, and indeed I haven't seen it before. While it is totally awesome and I really need to read its source code, I don't see how (as mentioned above) it would be able to show the actual replication state rather than the "currently connected replication state".
@martinheidegger we are trying to work through some of these issues in the CLI too, but we are focused more on the uploading side (trying to answer, does my server have all the data and can I close this process/close computer?).
To make sure I understand: it seems to me there are a few discrete pieces to solving the download side of things:
Somewhat along these lines, https://github.com/joehand/dat-push is probably the closest work I've done on this. The hardest part here is figuring out what "Done" means when pushing. For example, a user may be sparsely replicating. I may have pushed just 5 blocks, but that is all the user wants, so I should say the push is "Done". But that is hard to tease out =).
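To illustrate why that is hard to tease out: a "done" check is only decidable relative to a wanted-set, which is exactly the information the CLI doesn't have. A minimal sketch (`isPushDone`, `remoteHas`, and the block numbers are all hypothetical):

```js
// "Done" is only decidable relative to a wanted-set of blocks.
// `remoteHas(i)` is a stand-in for reading the remote peer's bitfield.
function isPushDone (wantedBlocks, remoteHas) {
  return wantedBlocks.every((i) => remoteHas(i))
}

// The same remote state yields opposite answers for different intents:
const remoteHas = (i) => i < 5 // the remote holds blocks 0..4
console.log(isPushDone([0, 1, 2, 3, 4], remoteHas))        // true: sparse user is done
console.log(isPushDone([...Array(100).keys()], remoteHas)) // false: full push is not
```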
> trying to answer, does my server have all the data and can I close this process/close computer
This is a tricky question because in the CLI you don't know what the user intended to upload. Maybe she only wanted to provide the dat so some other client can get some sparse data? Maybe she wanted two peers to hold the same data of the current version? Maybe she wanted to trigger a backup process? Maybe a separate command would be a good idea:
$ dat push-backup
Then the CLI could seed until one connected client has the whole bitfield of the version replicated locally. Establishing that is different from establishing whether the bitfield was ever replicated somewhere.
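A sketch of what such a push-backup check could look like, assuming (as above) that peer objects expose a `remoteBitfield` with a `get(i)` method; the function name and polling interval are made up:

```js
// Hypothetical check for a `dat push-backup` command: keep seeding until
// one single connected peer holds every block of the current version.
function waitForFullBackup (feed, { interval = 5000 } = {}) {
  return new Promise((resolve) => {
    const timer = setInterval(() => {
      const version = feed.length // blocks in the version we are backing up
      for (const peer of feed.peers) {
        let complete = version > 0
        for (let i = 0; i < version; i++) {
          if (!peer.remoteBitfield || !peer.remoteBitfield.get(i)) {
            complete = false
            break
          }
        }
        if (complete) {
          clearInterval(timer)
          return resolve(peer) // this one peer holds the whole bitfield
        }
      }
    }, interval)
  })
}
```

Note that this only answers "does one currently connected peer hold everything right now", not "was the bitfield ever replicated somewhere", which is exactly the distinction above.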
One bit can have multiple states:
For both upload and download, it seems necessary to know this for the entire bitfield in order to decide what to upload and what to download.
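For illustration, one possible per-bit breakdown (my own naming, derived from the local-vs-remote distinction above, not anything hypercore defines):

```js
// Hypothetical per-bit states; not an exhaustive list.
const BitState = {
  MISSING: 0,          // not held locally, no known remote copy
  LOCAL_ONLY: 1,       // held locally, never confirmed on a remote peer
  REMOTE_CONNECTED: 2, // confirmed on a currently connected peer
  REMOTE_EVER: 3       // confirmed replicated at some point (peer may be gone)
}

// Deciding what to transfer needs this state over the whole bitfield:
function blocksToUpload (states) {
  return states.flatMap((s, i) => (s === BitState.LOCAL_ONLY ? [i] : []))
}
function blocksToDownload (states) {
  return states.flatMap((s, i) => (s === BitState.MISSING ? [i] : []))
}
```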
> maybe I can tell by if they have 100% of the blocks)
100% of the blocks of the current version, right? What I am trying to get at is: different cloud-types may have different versions. Our download-peer is looking for the last-known-fully-replicated version in the swarm, as one peer could have 50% of the latest version while another one could have 100% of the version before that; then it should download 100% of the former version.
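A sketch of that selection rule, under the simplifying assumption that a version corresponds to a feed prefix (roughly the hyperdrive model) and that `peer.has(i)` is a stand-in for reading a peer's bitfield:

```js
// Pick the newest version that some peer fully holds. With peer A holding
// 50 of 100 blocks of the latest version and peer B holding all 80 blocks
// of an older version, this returns 80, so we download that older version.
function lastFullyReplicatedVersion (peers, latestVersion) {
  let best = 0
  for (const peer of peers) {
    // Largest contiguous prefix of blocks this peer holds, starting at 0.
    let prefix = 0
    while (prefix < latestVersion && peer.has(prefix)) prefix++
    if (prefix > best) best = prefix
  }
  return best
}
```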
From a user perspective of Dat, one problematic thing at the moment is that replication does not feel certain/stable. The green bar shown in Dat-Desktop only signals whether it has been replicated once and how many peers are listening. It doesn't mention which nodes are connected (particularly, a backup node) or whether all of the peers replicated all data of all versions. It also doesn't mention whether second-level or higher peers (nodes only connected to a peer, but not to the original client) have replicated data, which might very well happen if a node is disconnected for a while.
One approach I could think of:
The problem with this approach would be that every client would need to connect to/store every other client's data, which, even assuming sparse downloads, could mean an explosion in dat connections and data downloaded. Also: malicious clients could store malicious (i.e. very big) data in the progress-data, which could probably be prevented with a size limit?!
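The size-limit mitigation could be as simple as refusing to mirror any progress feed past a fixed budget, using hypercore's `feed.byteLength` (the feed's total data size); the cap value here is arbitrary:

```js
// Reject oversize progress feeds before mirroring them.
const MAX_PROGRESS_BYTES = 64 * 1024 // hypothetical cap, e.g. 64 KiB per client

function shouldMirrorProgressFeed (feed) {
  return feed.byteLength <= MAX_PROGRESS_BYTES
}
```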
Now, with that in mind: How would you solve that? Would this make sense at a lower level?