Think through the database schemas

The processing distinguishes between source and target tables.  The target can 
start as a largely empty database whose main tables are populated when the 
results are exported with Sqoop.  There is no need to back up these tables or 
copy them in beforehand.

The task is therefore to go through each table and optimize the processing so 
that we do not keep unnecessary duplicate copies of the same data in multiple 
databases (disk space is an issue).

This includes the DB that HIT uses.

Expect to have:
LIVE DB: Full schema populated
HIT DB: Partial schema only
Target processing DB: Full schema, partially populated from HIT / LIVE (think 
this through), then completed by the rollover
Registry DB: Can it hold the gbif_log_message table, to which the portal 
always connects?
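
One way to approach the per-table review is to write the plan down as data and 
list the tables that would end up in more than one database.  The sketch below 
is only an illustration: the database keys follow the list above, 
gbif_log_message is the only real table name, and table_a / table_b and the 
assignment of tables to databases are made-up placeholders, not the actual 
schema.

```python
from collections import defaultdict

# Hypothetical inventory of which tables each database would hold.
# Only gbif_log_message is named in this issue; table_a and table_b are
# placeholders, and the assignments are assumptions for illustration.
PLAN = {
    "LIVE": {"table_a", "table_b", "gbif_log_message"},
    "HIT": {"table_b"},
    "TARGET": {"table_a", "table_b"},
    "REGISTRY": {"gbif_log_message"},
}


def tables_held_more_than_once(plan):
    """Return {table: sorted database names} for tables planned in 2+ databases."""
    holders = defaultdict(list)
    for db, tables in plan.items():
        for table in tables:
            holders[table].append(db)
    return {t: sorted(dbs) for t, dbs in holders.items() if len(dbs) > 1}


if __name__ == "__main__":
    # Print each duplicated table with the databases that would hold a copy.
    for table, dbs in sorted(tables_held_more_than_once(PLAN).items()):
        print(f"{table}: {', '.join(dbs)}")
```

Each table reported this way is a candidate for the question above: does every 
listed database really need its own copy, or can one of them be dropped?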

Original issue reported on code.google.com by timrobertson100 on 22 Apr 2011 at 6:12
