Automatically extract data from assets

aembler commented 4 years ago

Assets uploaded to BrandCentral frequently have good EXIF metadata. Let's extract that data into the name and description (and anything else we can easily do).

In c5 version 9, add new options to this checkbox list in the dashboard.

You can find these options on /dashboard/system/files/image_uploading

If these options are checked, use an image processor to read the EXIF metadata when files are uploaded and automatically populate the name, description and keywords with this data. If there is other easily accessible data that might be useful, let me know what you find and we can figure out whether we want to add some more custom attributes specifically to BrandCentral to track it.
Make sure that this metadata (which is found at the asset file level) can make its way up to the asset somehow.
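
A minimal, language-neutral sketch of the option-gated population described above, written in Python rather than against the concrete5 API. The `ExtractionOptions` and `FileRecord` types and the `apply_exif` function are hypothetical, and the tag names used (`DocumentName`, `ImageDescription`, `XPKeywords`) are assumptions about which EXIF fields would feed each property:

```python
from dataclasses import dataclass, field


@dataclass
class ExtractionOptions:
    # Hypothetical flags, one per proposed dashboard checkbox.
    populate_name: bool = False
    populate_description: bool = False
    populate_keywords: bool = False


@dataclass
class FileRecord:
    # Stand-in for the uploaded file's editable properties.
    name: str = ""
    description: str = ""
    keywords: list = field(default_factory=list)


def apply_exif(record, exif, options):
    """Copy EXIF values onto the record for each enabled option.

    Tags that are missing or blank leave the existing value untouched.
    """
    title = str(exif.get("DocumentName", "")).strip()
    if options.populate_name and title:
        record.name = title

    description = str(exif.get("ImageDescription", "")).strip()
    if options.populate_description and description:
        record.description = description

    # Assumes keywords arrive as one semicolon-separated string.
    raw = str(exif.get("XPKeywords", ""))
    keywords = [k.strip() for k in raw.split(";") if k.strip()]
    if options.populate_keywords and keywords:
        record.keywords = keywords
    return record
```

Leaving existing values alone when a tag is blank means an upload without EXIF data keeps its filename-derived name.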

This is mostly a core change but since it's tied to delivering some updates to BrandCentral I thought I would add it here.

bitterdev commented 4 years ago

1. Should I implement a checkbox for each option, like "Use EXIF metadata and extract the keywords", or one global checkbox like "Extract EXIF metadata"?
2. There are many more values available. Most are relevant only to photographers, but I would still suggest fetching them all and creating an attribute for each value. Things like that are what make concrete5 so wonderful and what the community loves.
3. For the server-side implementation I would suggest using this library. It is MIT licensed.

Regarding the further implementation in BrandCentral:

Currently there is an API method that retrieves image information with Google Vision. Should I extend the processImage method to also return all EXIF values as new tags?

aembler commented 4 years ago

1. Yes, let's add new checkboxes right now for name, description and tags. Populate them on the file object.
2. I don't want to have to make a bunch of file attributes, but maybe we could add a fourth option that is "Populate Additional EXIF Metadata if available?" Then the routine could look for any custom file attributes whose handle matches "exif_{optionName}" and auto populate them. So if someone checks that checkbox and creates a file attribute with the handle "exif_copyright", the routine will look in the "Copyright" field for that data and populate the attribute. Does that make sense? Then admins could create whatever EXIF attributes they really cared about and the routine would auto populate them.
3. If this library looks good to you we can use it. We do however use Imagine\Image\Metadata\ExifMetadataReader already. Is there a reason we wouldn't use that?
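
The "exif_{optionName}" handle matching could be sketched roughly as follows. This is a language-neutral illustration in Python, not concrete5 code: `match_exif_attributes` and `normalize` are hypothetical names, the EXIF data is assumed to arrive as a plain key/value mapping, and the case- and underscore-insensitive comparison is an assumption about how a handle such as "exif_copyright" would be paired with the "Copyright" field:

```python
PREFIX = "exif_"


def normalize(key):
    """Lowercase and drop non-alphanumerics so 'focal_length' equals 'FocalLength'."""
    return "".join(ch for ch in key.lower() if ch.isalnum())


def match_exif_attributes(handles, exif):
    """Return {attribute handle: EXIF value} for handles of the form exif_<field>.

    Handles without the prefix, and fields that are absent or empty,
    are skipped.
    """
    lookup = {normalize(name): value for name, value in exif.items()}
    matched = {}
    for handle in handles:
        if not handle.startswith(PREFIX):
            continue
        value = lookup.get(normalize(handle[len(PREFIX):]))
        if value not in (None, ""):
            matched[handle] = value
    return matched
```

With this approach only the attributes an admin has actually created are populated; every other EXIF field is ignored.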

Finally, for BrandCentral: yes, I would love to be able to take these tags and also add them to the BrandCentral tags API on asset create, like the Google Vision routine does.

bitterdev commented 4 years ago

Okay, great.
Regarding 2: I think this is hard to use if admins need to create the attributes manually, because an attribute whose handle is not exactly right won't match. What do you think about creating the attributes programmatically when they don't already exist? I think that would be a great solution.
Regarding 3: Okay, great, I hadn't noticed that. I will use the core-integrated reader.

aembler commented 4 years ago

If you can group all of these attributes into their own File Attribute Set, then I think that's ok. I just don't want to have a million attributes if users only care about one or two (like you said, there are a lot of EXIF attributes.)

bitterdev commented 4 years ago

Okay, I will create a file attribute set and add all of the attributes to it.
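
A rough sketch of the "create missing attributes and group them in one set" idea, in Python for illustration only. Nothing here is concrete5 API: the dictionary of set name to handles stands in for real attribute storage, and `handle_for`, `ensure_exif_attributes`, the "EXIF" set name and the snake_case handle scheme are all assumptions:

```python
import re


def handle_for(tag):
    """Turn an EXIF tag name into an attribute handle, e.g. 'FocalLength' -> 'exif_focal_length'."""
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", tag).lower()
    return "exif_" + snake


def ensure_exif_attributes(attribute_sets, exif_tags, set_name="EXIF"):
    """Add a handle to the named set for every tag that has no attribute yet.

    attribute_sets maps set name -> list of handles and is updated in place.
    Returns the handles that were newly created.
    """
    existing = {h for handles in attribute_sets.values() for h in handles}
    target = attribute_sets.setdefault(set_name, [])
    created = []
    for tag in exif_tags:
        handle = handle_for(tag)
        if handle not in existing:
            target.append(handle)
            existing.add(handle)
            created.append(handle)
    return created
```

Because existing handles are checked first, running the routine on every upload would not create duplicates.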

concretecms / brand_central

Automatically extract data from assets #29