EOL / deprecated_eol_php_code

Encyclopedia of Life
http://eol.org/

Connector to/from Zooniverse projects #147

Open jhammock opened 8 years ago

jhammock commented 8 years ago

For Eli and Jen. People with project management access will find these content partners at https://www.zooniverse.org/lab/ (@eliagbayani @mariestuder @KatjaSchulz; anyone else who needs access, send @jhammock your Zooniverse username).

We will want to port images from EOL into the Zooniverse project (more investigation needed for supported formats) and tabulated data from Zooniverse into TraitBank (CSV exports are available for testing in the "Data Exports" tab). Images will usually be organized on EOL in a Collection (a partner resource collection) and/or under a single higher taxon. Ideally, we could filter by both, e.g. fetch images from http://eol.org/collections/246 that are children of http://eol.org/pages/2366. Jen wonders if it is possible to tap into whatever provides resource-specific lists of data objects to the Curator Worklist. But it may be safer to use the collections API and filter in a later step.
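As a rough illustration of the "filter in a later step" idea, here is a minimal Python sketch. It assumes the collection items have already been fetched and that each one has been annotated with the page IDs of its ancestors. The field names (`object_id`, `object_type`, `ancestor_page_ids`) are hypothetical and are not taken from the EOL API.

```python
def filter_images_under_taxon(items, ancestor_page_id):
    """Keep only image items that sit under the given higher taxon.

    `items` is a list of dicts describing collection items. The keys used
    here ("object_type", "ancestor_page_ids") are assumptions made for this
    sketch, not actual EOL API field names.
    """
    return [
        item
        for item in items
        if item.get("object_type") == "Image"
        and ancestor_page_id in item.get("ancestor_page_ids", ())
    ]


# Made-up items standing in for the contents of one collection.
collection_items = [
    {"object_id": 1, "object_type": "Image", "ancestor_page_ids": [1, 2366]},
    {"object_id": 2, "object_type": "Image", "ancestor_page_ids": [1, 99]},
    {"object_id": 3, "object_type": "Text", "ancestor_page_ids": [1, 2366]},
]

images = filter_images_under_taxon(collection_items, 2366)
```

Keeping the taxon filter separate from the fetch means the same function would work regardless of how the collection contents are obtained.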

Data coming out of Zooniverse should ideally get a significant amount of processing. Two kinds of measurements:

- Body length, width: distances measured on the screen. (Pixel positions are provided in the output; distance would need to be calibrated against a scale bar measurement with a length label annotation.)
- Coloration: pixel positions and image files. If extracting the pixel color data is practical, Jen can provide a mapping to color values. We can back-burner this one if it's prohibitively tricky or computationally too expensive.
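The scale bar calibration for length measurements could look something like the Python sketch below. It assumes each annotation is reduced to two (x, y) pixel endpoints and that the scale bar label has already been parsed into a number of millimetres. The function and parameter names are made up for illustration and do not reflect the actual Zooniverse export layout.

```python
import math


def pixel_distance(p, q):
    """Straight-line distance in pixels between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def calibrated_length_mm(start, end, bar_start, bar_end, bar_label_mm):
    """Convert an on-screen measurement to millimetres using the scale bar.

    `bar_label_mm` is the length written next to the scale bar, as entered
    by the volunteer (assumed here to be already parsed to a float).
    """
    bar_px = pixel_distance(bar_start, bar_end)
    if bar_px == 0:
        raise ValueError("scale bar has zero length in pixels")
    mm_per_pixel = bar_label_mm / bar_px
    return pixel_distance(start, end) * mm_per_pixel


# A 50-pixel body line measured against a 100-pixel bar labelled 10 mm.
length = calibrated_length_mm((0, 0), (30, 40), (0, 0), (100, 0), 10.0)
```

In this example the body line is 50 pixels long and the bar gives 0.1 mm per pixel, so the calibrated length comes out to 5 mm.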

Each trait record should be summarized from several Zooniverse records. A small number of volunteers will annotate each image (replicates), and their responses should be combined into one record per image. For example, a mean and standard deviation would be calculated for a body length measurement, and the contributor ID for each replicate would be added as an agent to the record. If possible, certain fields (for instance, the scale bar label annotation) should be compared and the record discarded if they are not identical.
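A minimal Python sketch of the per-image summary described above. It assumes each replicate has already been reduced to a dict with a contributor ID, a calibrated body length, and the scale bar label. The keys (`user`, `body_length_mm`, `scale_bar_label`) are hypothetical, not actual export column names.

```python
from statistics import mean, stdev


def summarize_image(replicates):
    """Combine replicate annotations of one image into a single record.

    Returns None (record discarded) when there are no replicates or when
    the volunteers disagree on the scale bar label.
    """
    labels = {r["scale_bar_label"] for r in replicates}
    if len(labels) != 1:
        return None
    lengths = [r["body_length_mm"] for r in replicates]
    return {
        "mean_mm": mean(lengths),
        # The sample standard deviation needs at least two replicates.
        "sd_mm": stdev(lengths) if len(lengths) > 1 else None,
        "n": len(lengths),
        # Each contributor is listed once as an agent on the record.
        "agents": sorted({r["user"] for r in replicates}),
    }


replicates = [
    {"user": "vol_a", "body_length_mm": 10.0, "scale_bar_label": "10 mm"},
    {"user": "vol_b", "body_length_mm": 12.0, "scale_bar_label": "10 mm"},
    {"user": "vol_c", "body_length_mm": 14.0, "scale_bar_label": "10 mm"},
]
record = summarize_image(replicates)
```

Comparing the labels as raw strings is the strictest option. If volunteers type "10mm" and "10 mm" for the same bar, the labels would need normalizing before the comparison, which this sketch does not attempt.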

jhammock commented 8 years ago

@eliagbayani lots to think about; when capacity permits, let me know where you want to start! Thanks :)

eliagbayani commented 8 years ago

Commenting to acknowledge and to become a participant in this thread. @jhammock since this depends on EoL harvesting, I was thinking we could start on it once general harvesting is working again. That way we can test any initial steps we take. Just FYI, at the moment I'm working closely on the Student Contribution Wiki and exploring MediaWiki use in the BHL-EoL data pipeline. Thanks.

jhammock commented 8 years ago

There is no urgency on this. After harvesting resumes sounds good!