Open mottosso opened 7 years ago
After a closer look, the development of the C4 framework seems to be in hibernation, maybe even abandoned.
Here are some links for getting to know the C4 framework: C4 Framework, C4 Language doc
As for the C4 Asset ID implementation, it's simply hashing a file, which one can easily implement with Python's `hashlib`. Getting rid of the characters that make a hash string non-double-click-selectable, like `+`, `-`, `/` and `=`, does indeed require some tricks, but I don't think it's a must, because as I imagine it, one wouldn't need to copy-paste a hash string by hand very often.
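To illustrate the character issue, here is a minimal sketch comparing two encodings of the same SHA-512 digest using only the standard library. The payload is a made-up stand-in for file contents, and this is not C4's own ID encoding, just a plain `hashlib` example.

```python
import base64
import hashlib

# Hypothetical stand-in for the bytes of an asset file.
data = b"example asset payload"

raw = hashlib.sha512(data).digest()  # 64 raw bytes

# Base64 may contain '+' and '/', and pads a 64-byte digest with '=='.
b64_id = base64.b64encode(raw).decode("ascii")

# The hex digest only uses 0-9 and a-f, so it is double-click-selectable.
hex_id = hashlib.sha512(data).hexdigest()

print(b64_id)
print(hex_id)
```

The hex form is longer (128 characters instead of 88), which is the trade-off for avoiding the awkward characters without any extra tricks.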
They have PyC4, but it's outdated and doesn't seem to behave the same as C4's Go implementation.
I think we could use simple file hashing to verify an asset's integrity while downloading or loading an asset representation, or to check whether source assets were modified before publishing (like checking textures when publishing LookDev).
But comparing two files by a simple hash string can only tell you whether they are identical or not.
While googling file hashing, I found similarity hashing. I think this is much more useful for us: it tells you not only whether two files are the same, but also how alike they are!
Here are two interesting repos I found: imagehash, python-hashes
I haven't looked deeply into or tested those two yet, but I think this could lead us to a better life.
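As a rough sketch of that idea, the snippet below records a digest per file at publish time and later reports which files changed or went missing. The function names `hash_file`, `build_manifest` and `modified_files` are hypothetical, and SHA-256 is just an assumed choice of algorithm.

```python
import hashlib
from pathlib import Path


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def build_manifest(paths):
    """Map each path to its digest, e.g. at publish time."""
    return {str(p): hash_file(p) for p in paths}


def modified_files(manifest):
    """Return the paths that are missing or whose digest no longer matches."""
    changed = []
    for path, recorded in manifest.items():
        if not Path(path).is_file() or hash_file(path) != recorded:
            changed.append(path)
    return changed
```

A publish step could store the manifest next to the representation, and a loader could call `modified_files` on it before trusting the files.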
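For a feel of how a similarity hash differs from a plain file hash, here is a minimal pure-Python sketch of the "average hash" idea: each pixel becomes one bit depending on whether it is brighter than the image mean, and two hashes are compared by Hamming distance. This is a simplified illustration on tiny hand-written grayscale grids, not the API of either repo above; real implementations also downscale the image first.

```python
def average_hash(pixels):
    """Return one bit per pixel: 1 if brighter than the mean, else 0."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(hash_a, hash_b):
    """Count the positions where two equal-length hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Made-up 4x4 grayscale "images" (0-255).
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [10, 20, 200, 210],
    [15, 25, 205, 215],
]
brighter = [[v + 5 for v in row] for row in original]    # slight edit
inverted = [[255 - v for v in row] for row in original]  # very different

print(hamming_distance(average_hash(original), average_hash(brighter)))  # 0
print(hamming_distance(average_hash(original), average_hash(inverted)))  # 16
```

A small distance means "looks alike", whereas a cryptographic hash of the same two files would simply differ.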
Thanks for the detailed follow-up @davidlatwe
After a few more tests on similarity hashing, I find that I misunderstood its use case. (facepalm)
It's more useful for searching (obviously), so I think that unless we are going to build a set-dressing library or some other kind of texture/matte-paint database, similarity hashing can be ignored. At least it doesn't fit into the publish process, where strict modification comparison is required (similarity hashing can provide strict comparison, but needs more computation).
On the use case of preventing duplicated content creation: besides commonly used image formats, some major 3D asset exchange formats like `.obj`, `.fbx` and `.abc` may not benefit from hashing because they are saved with a timestamp, so re-exporting the same content produces a different hash. But `.usd` may work, since it does not embed such metadata (as far as I recall).
From the perspective of content validation, whether or not production involves cloud storage, we can all gain some extra long-term security from hashing asset files in every publish process.
If we only work locally, then it's a quick answer: No. `hashlib` is much more convenient since we all use Python.
If we need to exchange files across studios or sites, I think the answer depends on the development of the C4 framework. One major issue that needs to be addressed is how we hash directories, unless we all share as `.zip`.
Currently, C4's directory hashing ignores duplicated files, so directories that differ only in duplicated files get the same ID. This may not be what we want, since there might be file sequences with the same content but different lengths.
Whether we use C4 or not, if we are going to work in the cloud or do any form of asset exchange, I think we need someone to define how we hash nested asset representation files and directories.
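One possible rule, sketched below under my own assumptions, is to hash every file together with its relative path in sorted order, so that duplicated files and sequence length both affect the result. The function name `hash_directory` and the exact byte layout are hypothetical, and this is not how C4 does it.

```python
import hashlib
from pathlib import Path


def hash_directory(root):
    """Return a SHA-256 hex digest over relative paths and file contents."""
    root = Path(root)
    # Sort by POSIX-style relative path so the order is stable across sites.
    entries = sorted(
        (p.relative_to(root).as_posix(), p)
        for p in root.rglob("*")
        if p.is_file()
    )
    hasher = hashlib.sha256()
    for rel_path, path in entries:
        # Reads each file fully into memory, which is fine for a sketch.
        file_digest = hashlib.sha256(path.read_bytes()).hexdigest()
        # Including the path means two identical frames still count twice.
        hasher.update(f"{rel_path}\0{file_digest}\n".encode("utf-8"))
    return hasher.hexdigest()
```

With this rule, a 2-frame and a 3-frame sequence of identical images get different digests, while the same directory copied to another site gets the same one. Empty directories are ignored here, which is another thing that would need to be agreed on.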
Hey @mottosso, thanks :) ( I should type the second post faster :P )
Evaluate whether C4 is relevant to us.