The processing allows for separate source and target tables. The target can be a
relatively empty database whose main tables are populated on sqoop out.
There is no need to back up these tables or copy them in.
The task is therefore to think through each table and optimize the
processing so that we do not keep unnecessary duplicate copies of data in
multiple databases (disk space is an issue).
This includes the DB that HIT uses.
Expect to have:
LIVE DB: Full schema populated
HIT DB: Partial schema only
Target processing DB: Full schema, partially populated from HIT / LIVE (this
needs thinking through), then completed by rollover
Registry DB: Can it hold the gbif_log_message table, to which the portal
always connects?
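
One way to approach the per-table review is to write the planned placement down and list every table that would be held in more than one database. The following is a minimal sketch only: the plan contents are placeholders, and apart from gbif_log_message none of the table names come from this issue.

```python
from collections import defaultdict

# Hypothetical placement plan: database -> tables it would hold.
# Only gbif_log_message is named in the issue; table_a and table_b
# are placeholders, and the placement shown is an assumption.
PLAN = {
    "LIVE": {"table_a", "table_b", "gbif_log_message"},
    "HIT": {"table_a"},
    "TARGET": {"table_a", "table_b"},
    "REGISTRY": {"gbif_log_message"},
}


def duplicated_tables(plan):
    """Return {table: sorted databases} for tables held in more than one database."""
    holders = defaultdict(list)
    for db, tables in plan.items():
        for table in tables:
            holders[table].append(db)
    return {t: sorted(dbs) for t, dbs in holders.items() if len(dbs) > 1}


if __name__ == "__main__":
    # Each printed line is a table to review: is every copy really needed?
    for table, dbs in sorted(duplicated_tables(PLAN).items()):
        print(f"{table}: {', '.join(dbs)}")
```

Each table reported this way is a candidate for review; some copies will be necessary, but each one costs disk space.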
Original issue reported on code.google.com by timrobertson100 on 22 Apr 2011 at 6:12