iscc / iscc-specs

ISCC: International Standard Content Code
http://iscc.codes

content specific IDs #92

Open livinter opened 3 years ago

livinter commented 3 years ago

[alert: typical newbie issue :-) ] [short: define the content by data, not by code]

I think there are several issues with a Content-ID that is defined by code.

Possible solution: each bit of the Content-ID is defined by an attribute that has a name. Examples might be [scientific, funny, violent, animal, social, ...].

For each name/category/attribute, example data is provided. If tests are written, the only requirement is that a minimum of X bits are correct. The detection or training should also take care that each bit is activated about 50% of the time, to get a uniform hash distribution.

Advantages: more freedom in writing individual content-similarity matchers, flexibility for updates, different content types can be matched against each other, and it is easier to specify.
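To make this concrete, here is a minimal Python sketch of the bit-per-attribute idea (an illustration only, not part of the ISCC specification or codebase). The attribute names reuse the examples above; how each attribute is detected (a classifier, manual tagging, ...) is deliberately left open, so the sketch just looks names up in a set of labels assumed to come from some upstream step.

```python
from typing import Set

# Bit position -> attribute name, in a fixed, specified order.
ATTRIBUTE_BITS = ["scientific", "funny", "violent", "animal", "social"]


def attribute_content_id(labels: Set[str]) -> int:
    """Set bit i iff attribute i was detected for the media."""
    code = 0
    for bit, name in enumerate(ATTRIBUTE_BITS):
        if name in labels:
            code |= 1 << bit
    return code


def passes_test(code: int, reference_code: int, min_correct_bits: int) -> bool:
    """Test rule from the proposal: at least `min_correct_bits` of the
    attribute bits must agree with the reference code."""
    differing = bin(code ^ reference_code).count("1")
    return len(ATTRIBUTE_BITS) - differing >= min_correct_bits


# Usage: a media item tagged {"animal", "funny"} sets bits 3 and 1.
assert attribute_content_id({"animal", "funny"}) == 0b01010
```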

livinter commented 3 years ago

I realized it may be difficult to get all media to this human level of understanding, maybe even impossible when the media doesn't contain human-level content (e.g. a video with a graphic effect).

Besides, someone could also ask: WHAT level of comprehension? (... in space and time.)

It can be solved by adding an extra number/id that specifies how deep the analysis has been (a sketch follows the list below):

  1. raw data: each bit would be specified by specific reference media (one set for each media type). For example, whatever the algorithm does, when picture number 10 is presented, bit number 10 must be set.

  2. concrete: human-understandable objects [building, elephant, trees, human, stone, big face, ...]. Only very frequent objects should be on the list so that each bit is activated about 50% of the time. Same as above: other bits MAY also be activated.

  3. properties, including abstract ones [social, violent, natural, NSFW, ...]
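A rough sketch of how such a depth number could sit next to the attribute bits (my interpretation, not ISCC header syntax), so that codes produced at different analysis depths are never compared directly:

```python
from enum import IntEnum


class AnalysisDepth(IntEnum):
    RAW_DATA = 1  # level 1: bits defined by specific reference media
    OBJECTS = 2   # level 2: concrete, human-understandable objects
    THEMES = 3    # level 3: abstract properties (social, violent, ...)


def encode_with_depth(depth: AnalysisDepth, attribute_code: int, n_bits: int = 64) -> int:
    """Pack the depth indicator above the n_bits attribute code."""
    return (int(depth) << n_bits) | (attribute_code & ((1 << n_bits) - 1))


def comparable(code_a: int, code_b: int, n_bits: int = 64) -> bool:
    """Only codes produced at the same analysis depth should be matched."""
    return (code_a >> n_bits) == (code_b >> n_bits)
```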

Supplying sample data could help align the different algorithms without being part of the specification. In the case of 1) raw data, images would be rotated, scaled, ...; for text, characters could be deleted, words moved, ...

In the case of 2), collections of media could be supplied: if we want an elephant bit, provide different elephants in pictures and in text; this could include multiple languages.
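Such sample data could drive a simple alignment test, sketched below under the assumption that `content_id` is any candidate implementation and `variants` are the transformed versions of a sample (rotated/scaled images, text with characters deleted or words moved, media showing different elephants, ...):

```python
def hamming(a: int, b: int) -> int:
    """Number of bits in which two codes differ."""
    return bin(a ^ b).count("1")


def passes_alignment_test(content_id, original, variants, max_distance: int) -> bool:
    """Every transformed variant must hash close to the original sample."""
    reference = content_id(original)
    return all(hamming(content_id(v), reference) <= max_distance for v in variants)
```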

For the existing pHash algorithm, one XOR value can be calculated so that it complies with the specification.
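One possible reading of that remark (an illustration, with a hypothetical `phash` function standing in for any existing perceptual hash): if the level-1 rule demands that reference medium number i sets bit number i, a single fixed XOR mask can be computed from the hashes of the reference media so that each of them satisfies its own bit.

```python
def calibration_mask(phash, reference_media, n_bits: int) -> int:
    """Set mask bit i whenever pHash(reference medium i) has bit i unset."""
    mask = 0
    for i, medium in enumerate(reference_media[:n_bits]):
        if not (phash(medium) >> i) & 1:
            mask |= 1 << i
    return mask


def calibrated_hash(phash, mask: int, media) -> int:
    """XOR the fixed mask onto the raw pHash; reference medium i now sets bit i."""
    return phash(media) ^ mask
```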

Contra: at the raw-data level, the implementation would only be specified very loosely.

Pro: depending on the use case, it is up to the user how deeply they want their content to be analyzed: 1) raw-data-based, 2) object-based, 3) theme-based. This can be desirable, as some video hash algorithms can be slow (on specific hardware).

The aim would be to have a specification that does not need to change while staying open to keeping the content matching up to date.

P.S.: Reference media need to be in a lossless format. Sample/training data do not.