open-AIMS / ADRIA_matlab

Repository for the development of ADRIA: Adaptive Dynamic Reef Intervention Algorithms. ADRIA is a multi-criteria decision support tool set particularly useful for informing reef restoration and adaptation interventions.

Use standardized data package format? #145

Open ConnectedSystems opened 2 years ago

ConnectedSystems commented 2 years ago

We currently have a fairly well standardized data package, defined by nested directories in the form of:

├───Site Name
│   ├───connectivity
│   │   ├───2015   # connectivity years
│   │   ├───2016
│   │   ├───2017
│   │   └───2019
│   ├───DHWs
│   ├───site_data  # spatial data
│   └───waves

It's just lacking a formal spec in the form of a datapackage.json file (see here). Although I've attempted to maintain an accompanying readme.md, it needs to be updated and cleaned up as well.
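As a rough illustration of what such a spec could look like, here is a minimal sketch that walks a data package directory and writes a `datapackage.json` listing every file as a resource. It assumes the descriptor follows the Frictionless Data Package convention of a top-level `name` plus a `resources` list with `name` and `path` entries. The function names (`build_descriptor`, `write_descriptor`) and the slug rule for resource names are hypothetical, and it is written in Python purely for illustration; ADRIA itself would need a MATLAB equivalent.

```python
import json
import re
from pathlib import Path


def build_descriptor(package_root, name):
    """Build a minimal descriptor dict for the files under package_root.

    Assumption: one resource per file, identified by its path relative
    to the package root (e.g. "connectivity/2015/<file>").
    """
    root = Path(package_root)
    resources = []
    for path in sorted(root.rglob("*")):
        # Skip directories and the descriptor itself.
        if not path.is_file() or path.name == "datapackage.json":
            continue
        rel = path.relative_to(root).as_posix()
        # Hypothetical naming rule: lowercase the relative path and
        # replace anything outside [a-z0-9._-] with a hyphen.
        slug = re.sub(r"[^a-z0-9._-]+", "-", rel.lower())
        resources.append({"name": slug, "path": rel})
    return {"name": name, "resources": resources}


def write_descriptor(package_root, name):
    """Write datapackage.json into the package root and return its path."""
    descriptor = build_descriptor(package_root, name)
    out_path = Path(package_root) / "datapackage.json"
    out_path.write_text(json.dumps(descriptor, indent=2))
    return out_path
```

A fuller descriptor would probably also record formats, licences and a description per resource; those are left out here because the thread does not specify them.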

Due to legacy design choices, we currently point to individual file/folder locations for connectivity, DHW, wave and site data, which happen to sit in the same folder as the project repository. This is not a viable long-term solution given:

  1. our planned move to a distributed app and/or dashboard, which more than likely includes interacting with the Amazon AWS-based system developed by the IS team.
  2. the likely increase in file sizes and the commensurate burden on git

Suggest we refactor things to accept a single arbitrary location that points to the data package, instead of the ~three~ four separate entries currently required. This single location could be an entry point, network drive, or other. Relevant data can then be loaded following the structure defined above.
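To make the suggestion concrete, here is a minimal sketch of resolving the four data locations from one package root, using the directory names from the structure above. The names `EXPECTED_DIRS`, `resolve_data_package` and `connectivity_years` are hypothetical, and the example is in Python only as an illustration; it is not how ADRIA currently loads data.

```python
from pathlib import Path

# Sub-directory names taken from the package layout described above.
# The dictionary keys are hypothetical labels chosen for this sketch.
EXPECTED_DIRS = {
    "connectivity": "connectivity",
    "dhw": "DHWs",
    "site_data": "site_data",
    "waves": "waves",
}


def resolve_data_package(package_root):
    """Map each data type to its directory under a single package root.

    Raises FileNotFoundError if any expected directory is missing, so a
    malformed package fails early rather than partway through a run.
    """
    root = Path(package_root)
    paths = {key: root / sub for key, sub in EXPECTED_DIRS.items()}
    missing = [str(p) for p in paths.values() if not p.is_dir()]
    if missing:
        raise FileNotFoundError(
            "Data package is missing directories: " + ", ".join(missing)
        )
    return paths


def connectivity_years(paths):
    """List the connectivity years available, e.g. [2015, 2016, ...]."""
    return sorted(
        int(p.name)
        for p in paths["connectivity"].iterdir()
        if p.is_dir() and p.name.isdigit()
    )
```

With something like this, callers pass one location and everything else is derived from the agreed layout, so moving the package to a network drive or other store only changes that single argument.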

@Rosejoycrocker could you voice your opinion/concerns/approval?

Rosejoycrocker commented 2 years ago

Hi @ConnectedSystems, sorry for the delay, I just saw this. I think this is a great idea, particularly if we want the interface to handle arbitrary clusters/reef sites. It would also be great if ADRIA() could be loaded with a single input location for the data package, rather than entering each of the data types separately.