Kinetic / kinetic-protocol


Data checksum #23

Open ottuzzi opened 9 years ago

ottuzzi commented 9 years ago

Hi,

it would be useful if I could ask the target drive to return a checksum of the written data, but I do not see this possibility in the protocol: am I missing some detail? I would like to be sure that when I ask to write some data, it is really written and can be read back as intended by the target disk. What I'm thinking of is a new call that writes the data and, in the same operation, returns the checksum of what was written to the disk. The returned value can then be compared with the host's own value, so we have a good probability that everything is fine. What do you think?

Thanks Bye Piero

jphughes commented 9 years ago

At this time we do not return the checksum of the data to be written, but we will check the value that is sent along with the data.

The way that it works now is:

Since the tag is set before sending, you can be assured of complete end-to-end data integrity. If the drive calculated the checksum instead, there would be a risk of a data integrity failure between the host and the drive (i.e., TCP is not perfect, and TCP error detection is not perfect either): the drive would return the checksum of the wrong information.
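To illustrate the point above, here is a minimal sketch, assuming the tag is a SHA-1 digest of the value computed on the host. The helper names (`make_tag`, `drive_accepts`, `flip_bit`) are hypothetical and are not part of the Kinetic protocol or any Kinetic client library.

```python
import hashlib


def make_tag(value: bytes) -> bytes:
    """Host side: compute the tag before the data leaves the host."""
    return hashlib.sha1(value).digest()


def drive_accepts(value: bytes, tag: bytes) -> bool:
    """Drive side (hypothetical): recompute the digest and compare it with the host's tag."""
    return hashlib.sha1(value).digest() == tag


def flip_bit(value: bytes, index: int = 0) -> bytes:
    """Simulate a corruption in transit that the transport failed to detect."""
    corrupted = bytearray(value)
    corrupted[index] ^= 0x01
    return bytes(corrupted)


payload = b"hello kinetic"
tag = make_tag(payload)

# The tag travels with the data, so the drive can reject a damaged payload.
clean_ok = drive_accepts(payload, tag)
corrupt_ok = drive_accepts(flip_bit(payload), tag)

# A checksum computed only by the drive would describe the corrupted bytes,
# so the drive on its own could not notice that anything went wrong.
drive_side_checksum = hashlib.sha1(flip_bit(payload)).digest()
```

In this sketch `clean_ok` is true and `corrupt_ok` is false, while `drive_side_checksum` differs from the host's `tag`.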

Hope this helps

Jim


ottuzzi commented 9 years ago

Hi,

thank you very much for your answers: everything you say is clear, but I was thinking of a more thorough check. I'll try to show the differences between what I understand is implemented at the moment and what I was thinking about.

NOW

MY PROPOSAL

In my proposal you keep the same behaviour, but I'm asking to add a new workflow that works this way:

The whole point here is to catch a subtle disk error: in your workflow the last check happens when the data arrives at the disk frontend, while in my proposal the last check is on the data as written to disk. It can happen that data written to disk cannot be read back correctly. With your approach you will only find out on the next read (probably when you need the data); with my proposed additional workflow you know immediately that the data can be read... at least for now ;)
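A minimal sketch of the difference between the two checks, using a toy in-memory drive. Everything here is an assumption made for illustration: `SimulatedDrive`, `put`, `put_and_verify` and the `corrupt_on_media` flag are hypothetical names, SHA-1 is an assumed checksum, and none of this is the Kinetic API.

```python
import hashlib
from typing import Dict, Optional


class SimulatedDrive:
    """Toy in-memory stand-in for a drive (hypothetical, not the Kinetic API)."""

    def __init__(self, corrupt_on_media: bool = False) -> None:
        self._media: Dict[bytes, bytes] = {}
        self._corrupt_on_media = corrupt_on_media

    def _store(self, key: bytes, value: bytes) -> None:
        # Optionally flip one bit to simulate a write that lands badly on the media.
        if self._corrupt_on_media and value:
            value = bytes([value[0] ^ 0x01]) + value[1:]
        self._media[key] = value

    def put(self, key: bytes, value: bytes, tag: bytes) -> bool:
        """Check on arrival only: verify the host's tag, then store."""
        if hashlib.sha1(value).digest() != tag:
            return False
        self._store(key, value)
        return True

    def put_and_verify(self, key: bytes, value: bytes, tag: bytes) -> Optional[bytes]:
        """Proposed extra workflow: store, read back, return the checksum of what was read."""
        if not self.put(key, value, tag):
            return None
        return hashlib.sha1(self._media[key]).digest()


payload = b"important data"
tag = hashlib.sha1(payload).digest()

healthy = SimulatedDrive()
faulty = SimulatedDrive(corrupt_on_media=True)

healthy_match = healthy.put_and_verify(b"k", payload, tag) == tag
faulty_plain_put = faulty.put(b"k", payload, tag)            # arrival check passes
faulty_match = faulty.put_and_verify(b"k", payload, tag) == tag
```

On the faulty drive the plain `put` reports success because the data arrived intact, while the checksum returned by `put_and_verify` no longer matches the host's value, so the host learns about the problem at write time.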

Hope I was clearer than in my first post :)

Thanks in advance Bye Piero