Edit and maintain Image data

sidkoul commented 11 years ago

How do we maintain images post-launch? Some examples:

Re-name a misidentified image
Add a new image
Delete an image
Create a thumbnail
Find all photos taken by a user
Change a copyright
Change the contact info for a user
Lookup permission info
Lookup image source

Image filenames have a naming convention. Check the importer for details, e.g. trillium-recurvatum-le-ahaines-b.jpg

The le designates this as a leaf, and the name ahaines is used at import time to associate the file with the contact info and copyright notes.
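To make the convention concrete, here is a minimal sketch of splitting such a filename into its parts. It assumes a genus-epithet-type-photographer layout with an optional trailing suffix, inferred only from the example above; the importer is the authority on the real rules, and the names `ImageName` and `parse_image_name` are hypothetical.

```python
import os
from typing import NamedTuple, Optional


class ImageName(NamedTuple):
    genus: str
    epithet: str
    type_code: str          # e.g. "le" for leaf
    photographer: str       # coded name, e.g. "ahaines"
    suffix: Optional[str]   # trailing part, e.g. "b"; meaning not stated in this thread


def parse_image_name(filename: str) -> ImageName:
    """Split a filename on hyphens.

    Assumes genus-epithet-type-photographer[-suffix]; the real importer
    may accept other shapes that this sketch rejects.
    """
    stem, _ext = os.path.splitext(os.path.basename(filename))
    parts = stem.split("-")
    if len(parts) not in (4, 5):
        raise ValueError(f"unexpected image filename: {filename!r}")
    genus, epithet, type_code, photographer = parts[:4]
    suffix = parts[4] if len(parts) == 5 else None
    return ImageName(genus, epithet, type_code, photographer, suffix)
```

For the example filename, `parse_image_name("trillium-recurvatum-le-ahaines-b.jpg")` yields a type code of `le` and a photographer code of `ahaines`.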

The idea is to keep this simple! We have to be careful not to build a large and complex product to solve these problems.

Idea: we might import the copyright holders' Excel file into Postgres as free-text fields. We wouldn't be able to issue structured queries against them, but we'd still be able to track the info and it would be centralized.

jnga commented 11 years ago

See also #370, "Botanists need a procedure for adding, removing, editing photos."

sidkoul commented 11 years ago

Thanks for the reminder, John. To summarize, we can keep the workflow entirely within the admin, entirely in S3, or use a hybrid of the two. I pinged Arthur about this today:

he is fine with working from S3 (I explained it's much like FTP)
it's not critical that changes be visible right away
only staff will be working on species image curation, volunteers will not help with this task
the copyright spreadsheet has a number of free form text fields that aren't included in the data model. They need to be added, so the spreadsheet can be ditched.

Given these requirements, it looks like a hybrid solution is the best fit. Arthur could:

use an S3 client to browse, find, view, inspect, and rename species images
use the admin to edit the data in the copyright spreadsheet
the site would re-index the images nightly

@jnga and @jrrickerson what are your thoughts?

The following list of functions has been vetted by Elizabeth and Arthur:

[ ] View image by name
[ ] View image by thumbnails
[ ] Re-name a misidentified image
[ ] Add a new image
[ ] Delete an image
[ ] Find all photos taken by a user
[ ] Change a copyright
[ ] Change the contact info for a user
[ ] Lookup permission info
[ ] Lookup image source
[ ] Automate server side thumbnail creation

jnga commented 11 years ago

Sounds like a workable plan.

jrrickerson commented 11 years ago

With the small image admin improvements made thus far, staff users can now:

Find image by name
Find all photos taken by a user
Reassign a misattributed photo to another user
View image by name

sidkoul commented 11 years ago

That looks good!

jrrickerson commented 11 years ago

While one can technically upload a new image for a given ContentImage via the Django admin, because we're parsing a lot of information from the actual S3 filenames when we scan images, the most sensible and simple way to handle image renaming, new image upload, and image deletion is likely to be a direct S3 client. This offers the advantage of being able to perform these operations in bulk as well, if necessary.
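As a rough illustration of the bulk case: S3 has no rename operation, so a rename is a copy to the new key followed by a delete of the old one. The sketch below assumes a boto3-style S3 client is passed in (for example one created with `boto3.client("s3")`); the helper names, bucket name, and keys are hypothetical.

```python
def rename_s3_object(client, bucket: str, old_key: str, new_key: str) -> None:
    """Rename an object by copying it to new_key, then deleting old_key."""
    client.copy_object(
        Bucket=bucket,
        CopySource={"Bucket": bucket, "Key": old_key},
        Key=new_key,
    )
    # Only reached if the copy did not raise an exception.
    client.delete_object(Bucket=bucket, Key=old_key)


def rename_many(client, bucket: str, renames: dict) -> None:
    """Apply a mapping of {old_key: new_key} one object at a time."""
    for old_key, new_key in renames.items():
        rename_s3_object(client, bucket, old_key, new_key)
```

A GUI client does the same thing behind its rename command, so this is only useful if a large batch of misidentified images ever needs fixing at once.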

Please review / evaluate the following S3 GUI clients to perform these tasks. Each of these has a free offering - if you wish to examine licensed products as well I can research a few more.

S3 Browser (Windows Only): http://s3browser.com/
CloudBerry (Windows Only): http://www.cloudberrylab.com/free-amazon-s3-explorer-cloudfront-IAM.aspx
Cyberduck (Mac and Windows): http://cyberduck.ch/
S3 Fox (Firefox extension): http://www.s3fox.net/

sidkoul commented 11 years ago

I'll have a look at those clients, JR. So is what's left now figuring out how we sync the changes made via the S3 client to the website, and adding the columns from the copyright spreadsheet that aren't yet in the database?

jrrickerson commented 11 years ago

I believe so. We should be able to do the syncing by running only the portion of the "load" script that does the S3 scanning and so forth, without having to run the entire script (the one that currently obliterates the database). If we separate out that piece, we should be able to schedule a job to run it nightly. I think any new or renamed images would then be updated automatically.
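The core of such a nightly job could be a comparison between the keys found in S3 and the keys the database already knows about. This is a sketch under that assumption, not the actual "load" script logic; `plan_image_sync` and `SyncPlan` are hypothetical names.

```python
from typing import Iterable, NamedTuple, Set


class SyncPlan(NamedTuple):
    to_add: Set[str]      # keys in S3 with no database record yet
    to_remove: Set[str]   # database records whose key is gone from S3


def plan_image_sync(s3_keys: Iterable[str], known_keys: Iterable[str]) -> SyncPlan:
    """Compare the scanned S3 keys with the keys already recorded."""
    s3 = set(s3_keys)
    known = set(known_keys)
    return SyncPlan(to_add=s3 - known, to_remove=known - s3)
```

Under this approach a rename appears as one removal plus one addition, and the new record's attributes would be re-parsed from the new filename.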

As for the Copyright information, I asked John a couple of quick questions about it and I'm proceeding to look at how we can migrate that more fully into the database and move away from the spreadsheet. Once I've added that to the admin I should be able to add a simple link from the ContentImage admin over to the copyright admin for the copyright information of the matching photographer.

If we want to get more complex, where the same photographer could have multiple copyrights for different images, we'll need to make some additional model changes to accommodate that.

sidkoul commented 11 years ago

A single coded name (e.g. ahaines) in the filename maps to a single copyright. If there were multiple copyrights for the same photographer, the Botanical Data Specialist added a number to the coded name (e.g. ahaines2) and added another row to the copyright spreadsheet.
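Because each coded name maps to exactly one copyright, a lookup keyed by the full coded name is enough. To group a photographer's copyrights together, the number can be split off. A small sketch, assuming coded names are lowercase letters with an optional trailing number and treating an unnumbered name as number 1 (both assumptions, as is the name `split_coded_name`):

```python
import re
from typing import Tuple


def split_coded_name(coded: str) -> Tuple[str, int]:
    """Split a coded name such as "ahaines2" into ("ahaines", 2).

    An unnumbered name is treated as number 1, which is an assumption
    of this sketch rather than a documented rule.
    """
    match = re.fullmatch(r"([a-z]+)(\d*)", coded)
    if match is None:
        raise ValueError(f"unexpected coded name: {coded!r}")
    base, number = match.groups()
    return base, int(number) if number else 1
```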

newfs / gobotany-app

Edit and maintain Image data #493