Brendiem / ala-citizenscience

Automatically exported from code.google.com/p/ala-citizenscience

Automate data import into the Atlas of Living Australia #358

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Some of our clients require that data from the BDRS be able to be imported into 
the Atlas of Living Australia.

A brief summary of the required functionality is:
1) Only public/not held data can be exported from the BDRS
2) The administrator should be able to identify the data to be exported on a 
per survey basis.  (i.e. export data collected using survey x, but not survey 
y). 
3) The Atlas will accept the data as a Darwin Core Archive.
4) The data should be able to be exported automatically or manually.
5) An earlier discussion with AJ indicated that there was a use case to be able 
to export data into another BDRS instance as opposed to the Atlas.

Original issue reported on code.google.com by chris.go...@gmail.com on 18 Oct 2012 at 4:49

GoogleCodeExporter commented 9 years ago
There is DarwinCore Archive export but I don't think there's a UI around it:
/webservice/application/downloadDwca.htm in
au.com.gaiaresources.bdrs.controller.webservice.DarwinCoreArchiveService
Under the menu there's:
Admin>Manage Data>Share Data
or
Admin>Manage Data>Download Data
but these are both just placeholder menu items that notify the user that the 
functionality is under development.

Original comment by ke...@gaiaresources.com.au on 18 Oct 2012 at 7:19

GoogleCodeExporter commented 9 years ago
Design Approach:
The proposed approach will be to refactor the index scheduling code to allow it 
to be reused to schedule other tasks.
This will include:
1) Database changes
   - Renaming the INDEX_SCHEDULE table to TASK_SCHEDULE 
   - Repurposing the CLASS_NAME column to be the class of the Task to be created. 
   - Adding a TASK_ARGS column to contain arguments to the Task Class.

2) Code changes
   - Refactoring a base class out of the AdminDataIndexController, as much of this functionality is reusable. 
   - Creating a DataExportScheduleController as a subclass of the refactored class.  
   - Changing the DarwinCoreArchiveService to allow finer control over what is exported (we need survey based control).
   - Writing the Task that will create the DwC archive and copy it to the ALA's upload directory (using SFTP).
   - Adding a preference to store the domain name and path to upload the DwC archive to.

3) UI changes
   - A new menu item and pages will be written to allow the export to be scheduled.  It will be similar to the existing data indexing scheduler UI but will have to take into consideration that multiple surveys will need to be scheduled as a single Task (indexing model objects are independent tasks whereas the export needs to produce a single archive).
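
To make the CLASS_NAME / TASK_ARGS idea concrete, here is a minimal sketch of how a scheduler could turn one TASK_SCHEDULE row into a running Task. Every name in it (ScheduledTask, TaskScheduleRow, ExportSurveysTask, TaskRunner) is hypothetical and not taken from the BDRS code, and the comma-separated survey id format for TASK_ARGS is only an assumption. Note that all surveys travel in one row, so the export stays a single Task producing a single archive.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical contract for anything the scheduler can run. */
interface ScheduledTask {
    /** Runs the task with the raw TASK_ARGS value and returns a short summary. */
    String run(String args);
}

/** Hypothetical in-memory view of one TASK_SCHEDULE row. */
final class TaskScheduleRow {
    final String className; // value of the CLASS_NAME column
    final String taskArgs;  // value of the TASK_ARGS column

    TaskScheduleRow(String className, String taskArgs) {
        this.className = className;
        this.taskArgs = taskArgs;
    }
}

/** Example task. A real one would build the DwC archive and upload it. */
final class ExportSurveysTask implements ScheduledTask {
    @Override
    public String run(String args) {
        // Assumed format: comma-separated survey ids, e.g. "12, 15".
        List<Integer> surveyIds = new ArrayList<>();
        for (String part : args.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                surveyIds.add(Integer.parseInt(trimmed));
            }
        }
        return "would export surveys " + surveyIds;
    }
}

final class TaskRunner {
    /** Instantiates the class named in the row and runs it with the row's arguments. */
    static String execute(TaskScheduleRow row) {
        try {
            Class<?> type = Class.forName(row.className);
            if (!ScheduledTask.class.isAssignableFrom(type)) {
                throw new IllegalArgumentException(row.className + " is not a ScheduledTask");
            }
            // Requires a no-argument constructor on the task class.
            ScheduledTask task = (ScheduledTask) type.getDeclaredConstructor().newInstance();
            return task.run(row.taskArgs);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create task " + row.className, e);
        }
    }
}
```

With this shape, the existing index scheduling could become just another ScheduledTask implementation, and `TaskRunner.execute(row)` would not need to know which kind of task it is running.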

Original comment by chris.go...@gmail.com on 18 Oct 2012 at 10:25

GoogleCodeExporter commented 9 years ago
(Edit to item 2) There is not much point in editing the DarwinCoreArchiveService; rather, 
the Task would be responsible for selecting the Records to pass to the RecordDwcaWriter.
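
A rough sketch of that selection step, assuming "public/not held" means a record must be both publicly visible and not held. RecordSummary and ExportSelector are hypothetical stand-ins, not the real BDRS Record model or the RecordDwcaWriter API; the result list is what the Task would hand to the writer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified stand-in for a BDRS record. */
final class RecordSummary {
    final int id;
    final int surveyId;
    final boolean publiclyVisible;
    final boolean held;

    RecordSummary(int id, int surveyId, boolean publiclyVisible, boolean held) {
        this.id = id;
        this.surveyId = surveyId;
        this.publiclyVisible = publiclyVisible;
        this.held = held;
    }
}

final class ExportSelector {
    /**
     * Keeps only records that belong to one of the scheduled surveys
     * and are both public and not held.
     */
    static List<RecordSummary> select(List<RecordSummary> records, Set<Integer> surveyIds) {
        List<RecordSummary> selected = new ArrayList<>();
        for (RecordSummary record : records) {
            boolean inScheduledSurvey = surveyIds.contains(record.surveyId);
            boolean exportable = record.publiclyVisible && !record.held;
            if (inScheduledSurvey && exportable) {
                selected.add(record);
            }
        }
        return selected;
    }
}
```

In practice the filtering would more likely be pushed into the database query than done in memory, but the conditions would be the same.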

Original comment by chris.go...@gmail.com on 19 Oct 2012 at 12:05

GoogleCodeExporter commented 9 years ago
Sounds good.

Just to clarify, the ALA has web services that accept Darwin Core Archive formatted 
data, i.e. the BDRS will be pushing to these web services.

If that's the case, accepted!

Original comment by aaron.lo...@gmail.com on 19 Oct 2012 at 1:53