alanshaw / ipfs-only-hash

#️⃣ Just enough code to calculate the IPFS hash for some data
MIT License

specs of the operation #13

Open kropple opened 3 years ago

kropple commented 3 years ago

hi.

I want to recreate IPFS hashing in PHP,

but I can't find a concrete, step-by-step specification of how it works anywhere.

My goal is to be able to hash a file with PHP without any IPFS components installed.

thanks for any pointers

titusz commented 2 years ago

I tried with Python ... and failed miserably. The problem in my case seemed to be that Python's protobuf serialization was not byte-compatible with the output of the go-ipfs implementation. A concrete spec or, better, a straightforward single-module implementation in any programming language would be helpful.

titusz commented 2 years ago

This will print out correct leaf hashes. If your file is smaller than 262144 bytes, it consists of a single chunk, so the one hash printed is an actual working "default" CIDv1:

"""IPFS CIDv1 Hashing"""
from hashlib import sha256
import base64

CHUNKSIZE = 262144

chunk_cid_header = (
    b"\x01"  # CID version 1
    b"\x55"  # multicodec: raw (leaf block holds the file bytes as-is)
    b"\x12"  # multihash function code: sha2-256
    b"\x20"  # digest length: 32 bytes, uvarint encoded
)

def base32(data: bytes) -> str:
    return base64.b32encode(data).decode("ascii").rstrip("=").lower()

def ipfs_hash(path: str):
    with open(path, 'rb') as infile:  # was `p`, an undefined name
        data = infile.read(CHUNKSIZE)
        while data:
            digest = sha256(data).digest()
            chunk_cid = 'b' + base32(chunk_cid_header + digest)
            print(f'{chunk_cid} {len(data)}')
            data = infile.read(CHUNKSIZE)

The trouble starts if you need to build UnixFS PBNode(s) for files larger than one chunk ...
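
For what it's worth, here is a rough sketch of that next step, hand-encoding the protobuf bytes instead of going through a protobuf library. It rests on assumptions not confirmed in this thread: the go-ipfs defaults for `ipfs add --cid-version=1` (fixed 262144-byte chunks, raw leaves, balanced layout with at most 174 links per node), links serialized before the node's data, and an empty link name that is still written out. It only handles a single root node, so files needing more than 174 chunks are rejected. The helper names (`uvarint`, `cid_bytes`, `cid_str`, `file_cid`) are made up for illustration. Please compare the result against a real `ipfs add` before relying on it.

```python
"""Sketch: CIDv1 for files spanning more than one chunk (single root node only)."""
from hashlib import sha256
import base64

CHUNKSIZE = 262144
MAX_LINKS = 174  # assumed go-ipfs default fan-out per node


def uvarint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def cid_bytes(codec: int, block: bytes) -> bytes:
    """Binary CIDv1: version, codec, sha2-256 code, digest length, digest."""
    return b"\x01" + uvarint(codec) + b"\x12\x20" + sha256(block).digest()


def cid_str(cid: bytes) -> str:
    """Multibase 'b' prefix plus lowercase, unpadded base32."""
    return "b" + base64.b32encode(cid).decode("ascii").rstrip("=").lower()


def file_cid(data: bytes) -> str:
    chunks = [data[i:i + CHUNKSIZE] for i in range(0, len(data), CHUNKSIZE)]
    if len(chunks) <= 1:
        return cid_str(cid_bytes(0x55, data))  # single raw leaf, as above
    if len(chunks) > MAX_LINKS:
        raise NotImplementedError("needs a multi-level tree, not covered here")

    links = b""
    # UnixFS Data message: Type = File (field 1), filesize (field 3)
    unixfs = b"\x08\x02" + b"\x18" + uvarint(len(data))
    for chunk in chunks:
        leaf = cid_bytes(0x55, chunk)
        # PBLink: Hash (field 1), empty Name (field 2), Tsize (field 3)
        link = (b"\x0a" + uvarint(len(leaf)) + leaf
                + b"\x12\x00"
                + b"\x18" + uvarint(len(chunk)))
        links += b"\x12" + uvarint(len(link)) + link  # PBNode.Links (field 2)
        unixfs += b"\x20" + uvarint(len(chunk))       # blocksizes (field 4)
    # PBNode.Data (field 1) is written after the links
    node = links + b"\x0a" + uvarint(len(unixfs)) + unixfs
    return cid_str(cid_bytes(0x70, node))  # 0x70 = dag-pb
```

Calling `file_cid(open(path, 'rb').read())` reads the whole file into memory, which is fine for a sketch but not for large files; a streaming version would hash chunk by chunk as in the snippet above.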