0chain / blobber

A storage provider (blobber) interface to the blockchain and consumers of storage.

Blobber can mess with the clients. #277

Closed lpoli closed 1 year ago

lpoli commented 3 years ago

There are challenge protocols to keep blobbers in check, but they relate to file content. The issue, however, is the metadata about the files in the blobber database. Everywhere in the gosdk, consensus is checked against file content, but the fileMetadata is also crucial to reaching consensus, and I think fileMetadata is never challenged.

Consider a scenario where a user uploads a file to blobbers with 2 data shards and 1 parity shard, and consider the blobber "reference_objects" table. If two of the blobbers change some metadata but leave, say, pathHash, ActualFileHash, etc. untouched, and the client requests the metadata and gets inconsistent responses, we have no way to fetch the correct data unless we check all the fields. One way would be to calculate consensus for each column value across all blobber responses and accept only the value with the highest consensus rate. By the way, a blobber can also serve metadata requests from a separate database rather than the main database, so it never fails a challenge.
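A minimal sketch of the per-column consensus idea, in Go since gosdk is written in Go. The `FileMeta` type, the `FieldConsensus` function, and the field names are hypothetical and are not taken from gosdk; each blobber response is assumed to be flattened to a map of column name to string value.

```go
package main

import "sort"

// FileMeta is a hypothetical, simplified view of one blobber's metadata
// response: column name -> value as a string.
type FileMeta map[string]string

// FieldConsensus picks, for every column, the value reported by the most
// blobbers. Columns whose best value has fewer than threshold votes are
// returned as disputed instead of being accepted.
func FieldConsensus(responses []FileMeta, threshold int) (FileMeta, []string) {
	votes := make(map[string]map[string]int) // column -> value -> vote count
	for _, resp := range responses {
		for field, value := range resp {
			if votes[field] == nil {
				votes[field] = make(map[string]int)
			}
			votes[field][value]++
		}
	}

	agreed := make(FileMeta)
	var disputed []string
	for field, counts := range votes {
		bestValue, bestCount := "", 0
		for value, n := range counts {
			// Ties are broken by the smaller string so the result is deterministic.
			if n > bestCount || (n == bestCount && value < bestValue) {
				bestValue, bestCount = value, n
			}
		}
		if bestCount >= threshold {
			agreed[field] = bestValue
		} else {
			disputed = append(disputed, field)
		}
	}
	sort.Strings(disputed)
	return agreed, disputed
}
```

Note that voting like this only helps while the honest blobbers still form the majority for a column. If two of three blobbers return the same altered value, that value wins the vote, which is why a network-level challenge is still needed.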

We could calculate a hash of all the returned fields and compare the hashes against one another. But the network should also challenge blobbers for the fileMetadata.
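The hash comparison could look roughly like the sketch below. `MetaHash` and `GroupByHash` are hypothetical helpers, SHA-256 is an assumption (the thread does not name a hash function), and the metadata is again assumed to be a flat map of column name to string value.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"sort"
	"strconv"
)

// MetaHash hashes all fields in sorted key order, so identical metadata
// always produces the same digest regardless of map iteration order.
func MetaHash(meta map[string]string) string {
	keys := make([]string, 0, len(meta))
	for k := range meta {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		v := meta[k]
		// Length prefixes keep ("ab","c") and ("a","bc") from colliding.
		h.Write([]byte(strconv.Itoa(len(k)) + ":" + k))
		h.Write([]byte(strconv.Itoa(len(v)) + ":" + v))
	}
	return hex.EncodeToString(h.Sum(nil))
}

// GroupByHash maps each digest to the sorted IDs of the blobbers that
// returned metadata with that digest.
func GroupByHash(responses map[string]map[string]string) map[string][]string {
	groups := make(map[string][]string)
	for blobberID, meta := range responses {
		digest := MetaHash(meta)
		groups[digest] = append(groups[digest], blobberID)
	}
	for _, ids := range groups {
		sort.Strings(ids)
	}
	return groups
}
```

With more than one group, the client knows the blobbers disagree and can see which ones side with which digest, but it still cannot tell which digest is the authentic one without something signed by the client.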

shaktals commented 3 years ago

I'm new here and pretty ignorant about 0chain / blobbers internals, but including metadata hashes on the consensus sounds like a good idea.

taustin commented 3 years ago

Correct me if I am wrong, but don't the write markers connect to the metadata? Assuming the markers are signed by the client, that should prevent the blobber from presenting fraudulent metadata for challenges.

One additional thought -- would it make sense to add a versioning counter to the markers? That would simplify which blobber was out of date when comparing two blobbers without needing to get a group consensus.

I'm not familiar with the codebase, so please clarify if I am off in my understanding of any key points.

lpoli commented 3 years ago

As per the whitepaper, a write marker contains client id, blobber id, allocation id, timestamp, file root, the hash of the root directory on the file system, prev file root, write counter, and client signature.

The paper talks about metadata verification in section 8.2: Justification Phase. I checked the code, and it verifies objectPath (also aliased as merkle path), but the other metadata is not verified.

The problem, however, is that a blobber can feed the client inconsistent metadata without failing network challenges. It seems the only solution is to get fileMetaData from all the blobbers and reach consensus over the responses.

I am not sure about a versioning counter, as a blobber can still act as a bad actor by providing the correct version number but wrong field values.

We have a check on whether a blobber is storing the data, but not on whether a blobber refuses to serve the client consistent data (here, fileMeta).

Please correct me if I have misunderstood the issue.

taustin commented 3 years ago

It sounds like we might be able to resolve this issue by adding the fileMetaData (or a hash of it) into the marker.

Any issues with that approach?

lpoli commented 3 years ago

Yes, it should work. Any filemeta hash signed by the user and stored in the reference_objects table will do. For a given path, the client needs to send signed metadata for all subdirs along the path.

So maybe the blobber can impose some kind of limit on directory levels.

guruhubb commented 3 years ago

Isn't there a check that requires data+1 blobbers to agree on the metadata? I remember we went through this before in our /conductor test profile. Check whether there is such a corner case in conductor.blobber-1.yaml, blobber-2; you can run this and see for yourself. Feel free to add new cases.

moldis commented 2 years ago

cc @kushthedude

lpoli commented 2 years ago

@guruhubb I would like to propose one simple solution for this. Currently the created_at and updated_at datetime values are not consistent among blobbers. It's better to have the same datetime values everywhere. The blobber should take these values from the writemarkers, and we can let the client (gosdk) send them to the blobbers.

For validation of the metadata, let's also send the client's signature with each ref. That means we need to add a signature column, and later on the user or a shared user can validate whether the metadata is authentic.
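A rough sketch of what a signed ref could look like. Everything here is an assumption for illustration: the `Ref` struct and its fields are hypothetical rather than the actual reference_objects schema, and Ed25519 from the Go standard library stands in for whatever signature scheme the client wallet really uses.

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha256"
	"fmt"
)

// Ref is a hypothetical, trimmed-down reference object. CreatedAt and
// UpdatedAt are supplied by the client so every blobber stores the same values.
type Ref struct {
	Path           string
	ActualFileHash string
	CreatedAt      int64
	UpdatedAt      int64
	Signature      []byte // proposed new column: client signature over the metadata
}

// digest hashes the metadata fields in a fixed order. %q quotes the strings
// so a "|" inside a path cannot shift field boundaries.
func (r Ref) digest() []byte {
	payload := fmt.Sprintf("%q|%q|%d|%d", r.Path, r.ActualFileHash, r.CreatedAt, r.UpdatedAt)
	sum := sha256.Sum256([]byte(payload))
	return sum[:]
}

// NewClientKey derives a key pair from a 32-byte seed (demo helper only).
func NewClientKey(seed []byte) (ed25519.PublicKey, ed25519.PrivateKey) {
	priv := ed25519.NewKeyFromSeed(seed)
	return priv.Public().(ed25519.PublicKey), priv
}

// SignRef is what the client would do before sending the ref to each blobber.
func SignRef(priv ed25519.PrivateKey, r Ref) Ref {
	r.Signature = ed25519.Sign(priv, r.digest())
	return r
}

// VerifyRef is what the owner or a shared user would do with a ref returned
// by a blobber: any altered field makes the signature check fail.
func VerifyRef(pub ed25519.PublicKey, r Ref) bool {
	return ed25519.Verify(pub, r.digest(), r.Signature)
}
```

With this in place a single blobber response can be checked on its own, without collecting responses from all blobbers. It does not stop a blobber from replaying an older, validly signed ref, so some version or write-counter check would still be needed for that case.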

cc @cnlangzi @sculptex