heathercole closed 3 years ago
Can we get a final decision on whose responsibility it is to process assets? I had no intention of converting CR2 to tiff as part of a bulk ingestion routine because, in effect, this entails the submission of a derivative purported to be the original & is subject to the vagaries of whatever version of image conversion software is employed in a script. If this is unified at a single point (post-submission) then regeneration of derivatives can be executed as necessary & checksums better managed.
The file-type conversion of conveyor images from .cr2 to a non-proprietary file-type has been part of the processing requirements from the very beginning (including rotation into profile-orientation). James committed to fulfilling these requirements, therefore they were removed from the conveyor-vendor deliverables. I understand that the script Jackson piloted addressed this requirement. The assets need to be available to clients (including the public) in an accessible (non-proprietary) format.
@heathercole Yes, not denying that. The issue is where and when the conversion should be done, ensuring file names are reference-able through the metadata that are generated once the assets are ingested & files renamed on the server as part of the opaque ingestion process. I assume you have no intention of wiping out the .cr2 files once tiffs are generated, and it would be dangerous to do so; it's near impossible to safeguard against the occasional messed-up conversion from .cr2 to tiff that might happen. And so, we must have the capacity to regenerate tiffs from the .cr2 when required without having to recreate a new asset record with its complement of metadata. It would be much more efficient to do this from the multimedia module because you can see that a conversion did not work as intended & there is already a handle to the original asset.
What I'm ultimately getting at here is this: when we see that an asset is corrupt or has a glitch, and the problem apparently resides in the tiff as a result of the conversion from .cr2, what then?
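To make the regeneration scenario concrete, here is a minimal sketch of rebuilding a tiff from its .cr2 while keeping the existing asset record. It is not the ingestion scripts or the multimedia module; the record fields (`original_path`, `derivative_path`, `original_sha256`, `derivative_sha256`) and the `convert` callable are assumptions made up for illustration, and the actual RAW conversion tool is deliberately left out.

```python
import hashlib
from pathlib import Path
from typing import Callable


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def regenerate_derivative(record: dict, convert: Callable[[Path, Path], None]) -> dict:
    """Rebuild the derivative for an existing asset record from its original.

    The record keeps its id and metadata; only the derivative checksum is
    refreshed. Raises ValueError if the stored original no longer matches
    the checksum recorded for it.
    """
    original = Path(record["original_path"])
    derivative = Path(record["derivative_path"])
    if sha256_of(original) != record["original_sha256"]:
        raise ValueError(f"original changed on disk: {original}")
    # Hypothetical converter supplied by the caller, e.g. a wrapper
    # around whichever RAW conversion tool ends up being trusted.
    convert(original, derivative)
    updated = dict(record)
    updated["derivative_sha256"] = sha256_of(derivative)
    return updated
```

The point of the sketch is that the original's checksum never changes and the metadata are never recreated; only the derivative and its checksum are replaced.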
When clients request images, the responsibility can't be on collection staff to implement any conversions, as they may not have access to relevant software tools. If the internal and external media server/interface supports on-demand conversion/creation of relevant derivatives whenever required, then the requirement for the media module would be to display the cr2 files in the correct (portrait) orientation.
While I agree with the principles you describe regarding originals and derivatives, the reality is that there would be no need to maintain proprietary file-type originals if appropriate lossless converted files were created. Certainly conversion scripts would need to be well trusted and appropriate checks would need to be in place.
I understand that there are different ways for the requirements to be met, but we need to be sure they aren't falling through the cracks between "issues". If I assume that requirements are being met as documented on the conveyor script, then I am not creating additional "issues" related to them. If you are changing the deliverables on the conveyor processing script, then it needs to be clear where/who will be taking responsibility for the related requirements.
I'll ask this another way as it's (somewhat) related. Do you want the ability to fetch or at least find the .cr2 from the multimedia module when the tiff appears corrupted & evidently needs to be regenerated? And then once found, would you not then want to "re-upload" it without having to recreate all the metadata that were previously generated only through the context of a rather hairy set of scripts?
Thanks for your insight on this, @dshorthouse. I will review and discuss with @shannonasencio, as she will ultimately be the curator of the assets. I will present the requirements and then WP3 can evaluate the most effective solutions.
Done by combination of harvester and derivatives.
I have requested a demo of this several times to ensure the requirements are met. At the moment, when I upload a cr2 into the media module there is no image preview shown and no current way to search on the barcode from images off the conveyor.
300k+ metadata.yml sidecars were completed yesterday along with a 70MB csv export. I'll leave it to @cgendreau to trigger a sample import using the harvester for your perusal & @ssbilkhu to churn the csv export into Excel with clickable links.
That sounds good for the harvester/automated import. However, this requirement also relates to the manual process, which currently doesn't display or preview cr2 files uploaded/saved by a user.
We will see later how we can handle that, but it won't happen automatically in the short term for manual uploads. We are currently working on the on-demand conversion, so after that we will just need to connect the two together.
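For discussion purposes, on-demand conversion could look roughly like the sketch below: a derivative is created the first time it is requested and reused afterwards. This is not the object store's implementation; `get_derivative`, the cache directory layout, and the `convert` callable are all assumptions for illustration only.

```python
from pathlib import Path
from typing import Callable


def get_derivative(original: Path, cache_dir: Path,
                   convert: Callable[[Path, Path], None],
                   suffix: str = ".tiff") -> Path:
    """Return a cached derivative for `original`, converting only when needed."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    derivative = cache_dir / (original.stem + suffix)
    # Convert when no derivative exists yet or the original is newer than it.
    stale = (not derivative.exists()
             or derivative.stat().st_mtime < original.stat().st_mtime)
    if stale:
        convert(original, derivative)  # hypothetical converter supplied by the caller
    return derivative
```

With something along these lines, a manual upload and a harvester import would both go through the same conversion path, which is the "connect the two together" step.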
I was clicking around trying to recreate a view distortion I have seen a few times, and noticed that the image is NOT displaying for this record.
http://dina-ui-template-dinaui.apps.biodiversity.agr.gc.ca/object-store/object/view?id=6f248422-0c70-4e05-ab1f-3a303cf6eea8
Maybe it relates to the fact that this is a CR2, Canon's proprietary format. Currently, the high-quality version of most DAO images is in .cr2 format. This is not ideal, and not intended to be the long-term format. These will need to be converted into a lossless, non-proprietary file type.
If conversion support is not implemented when images are imported, then .cr2 files should be displayed in the module, but if we can figure out conversion support ahead of time (e.g. tiff or a similar alternative), then perhaps no support for .cr2 is required at all in the module.
This may be able to be tied to the conveyor image processing workflow that @jmacklin and @dshorthouse are working on, as part of that requires a similar conversion.