geogeeks-au / maps-for-lost-towns

Georeferencing historic maps through crowdsourcing

Knit together SRO's digital objects with our copies #11

Closed keithamoss closed 8 years ago

keithamoss commented 8 years ago

Now let's stitch the two together so we can see whether the files we have can be associated with SRO's digital objects collection.

Dependencies: #10 Next: #9

Process

keithamoss commented 8 years ago

What else have they stored in the JPEG metadata?

At least resolution (PPI), but what else?!
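
One way to check the resolution part without extra libraries is to read the density fields straight out of the JFIF APP0 segment. This is only a minimal sketch: `jfif_density` is a hypothetical helper, it assumes the files carry a JFIF header (resolution may instead live in EXIF, which this does not read), and it says nothing about what other metadata is present.

```python
import struct


def jfif_density(data: bytes):
    """Return (units, x_density, y_density) from a JFIF APP0 segment, or None.

    units: 0 = aspect ratio only, 1 = dots per inch, 2 = dots per cm.
    """
    if data[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        raise ValueError("not a JPEG file")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None  # not at a marker; give up rather than guess
        marker = data[pos + 1]
        if marker == 0xDA:  # start of scan: header segments are over
            return None
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE0 and payload[:5] == b"JFIF\x00" and len(payload) >= 12:
            # Layout after the identifier: version (2 bytes), units (1),
            # x density (2), y density (2).
            return struct.unpack(">BHH", payload[7:12])
        pos += 2 + length
    return None


# Hand-built header for illustration: 300 x 300 dots per inch.
sample = (
    b"\xff\xd8\xff\xe0" + struct.pack(">H", 16) + b"JFIF\x00\x01\x01"
    + struct.pack(">BHH", 1, 300, 300) + b"\x00\x00\xff\xd9"
)
density = jfif_density(sample)
```

For a real file, pass the first few kilobytes read in binary mode (`open(path, "rb").read(4096)`).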

keithamoss commented 8 years ago

Use SQLite/PostgreSQL rather than JSON to make querying easier.
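
A rough sketch of what that move could look like with SQLite. The table name, column names, and the two records are made up for illustration and are not the project's actual schema or data.

```python
import json
import sqlite3

# Hypothetical records standing in for the JSON we currently have.
records = json.loads("""[
  {"filename": "Town_Plan_1.jpg", "filesize": 1048576},
  {"filename": "Town_Plan_2.jpg", "filesize": 2048}
]""")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE digital_objects (filename TEXT PRIMARY KEY, filesize INTEGER)"
)
# Named placeholders let each JSON object be inserted directly as a row.
conn.executemany(
    "INSERT INTO digital_objects (filename, filesize) VALUES (:filename, :filesize)",
    records,
)

# Once loaded, filtering is a one-line query instead of a loop over JSON.
large = conn.execute(
    "SELECT filename FROM digital_objects WHERE filesize > ? ORDER BY filename",
    (100000,),
).fetchall()
```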

keithamoss commented 8 years ago

Need to exclude fieldbooks

keithamoss commented 8 years ago

6090 rows matching

SELECT fields->'filename' AS do_filename,
       fields->'filesize' AS do_filesize,
       img.filename,
       img.filesize,
       img.id AS img_id
FROM sro_digital_objects_collection AS "do",
     sro_images AS "img"
WHERE fields->>'filename' = replace(img.filename, ' ', '_')

655 DOs with no match

SELECT fields->'filename' AS do_filename
FROM sro_digital_objects_collection AS "do"
WHERE NOT EXISTS (
    SELECT replace(filename, ' ', '_')
    FROM sro_images
    WHERE fields->>'filename' = replace(filename, ' ', '_')
)

keithamoss commented 8 years ago

From a quick comparison of file sizes, it looks like they match pretty closely. Calling this done!
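
For reference, the match, the no-match check, and the file size comparison can be reproduced on toy data. This is a sketch only: it uses SQLite with plain columns rather than the PostgreSQL JSON `fields` column, and the table names and rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE digital_objects (filename TEXT, filesize INTEGER);
CREATE TABLE images (id INTEGER PRIMARY KEY, filename TEXT, filesize INTEGER);
-- Invented rows: digital objects use underscores, our copies use spaces.
INSERT INTO digital_objects VALUES
    ('Town_Plan_1.jpg', 1000), ('Town_Plan_2.jpg', 2000), ('Town_Plan_3.jpg', 3000);
INSERT INTO images (filename, filesize) VALUES
    ('Town Plan 1.jpg', 1000), ('Town Plan 2.jpg', 2100);
""")

# Matched pairs, with the size difference as a percentage of the DO size.
matched = conn.execute("""
    SELECT d.filename, d.filesize, i.filesize,
           ABS(d.filesize - i.filesize) * 100.0 / d.filesize AS pct_diff
    FROM digital_objects AS d
    JOIN images AS i ON d.filename = replace(i.filename, ' ', '_')
    ORDER BY d.filename
""").fetchall()

# Digital objects with no corresponding image (the anti-join).
unmatched = conn.execute("""
    SELECT d.filename
    FROM digital_objects AS d
    WHERE NOT EXISTS (
        SELECT 1 FROM images AS i
        WHERE d.filename = replace(i.filename, ' ', '_')
    )
""").fetchall()
```

Sorting the matched rows by `pct_diff` descending would surface any pairs whose sizes differ enough to deserve a closer look.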

keithamoss commented 8 years ago

SELECT MIN(width), MIN(height),
       MAX(width), MAX(height),
       AVG(width), AVG(height),
       MEDIAN(width), MEDIAN(height)
FROM sro_images

MEDIAN is not a built-in PostgreSQL aggregate; it comes from the custom aggregate defined at https://wiki.postgresql.org/wiki/Aggregate_Median
