Background:

Right now `/v1/datasets` returns datasets generated by the dataset metadata generator and stored on S3 (e.g. `dev-dataset-metadata.json`).

What does the metadata lambda do now?

- generates a temporal domain: the list of dates that are valid for a particular dataset
- does something similar for "sites" (see the sketch below)
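The exact schema of the generated metadata file isn't spelled out here, so the sketch below is only illustrative: it shows how a temporal domain (a list of valid dates) could be computed and placed into a dataset entry, with "sites" handled the same way. Field names are assumptions, not the actual `dev-dataset-metadata.json` layout.

```python
import json
from datetime import date, timedelta

def build_temporal_domain(start: date, end: date, step_days: int = 1) -> list:
    """Return the ISO date strings assumed valid for the dataset."""
    days = (end - start).days
    return [(start + timedelta(days=i)).isoformat() for i in range(0, days + 1, step_days)]

dataset_metadata = {
    "id": "example-dataset",  # placeholder dataset id
    "domain": build_temporal_domain(date(2020, 1, 1), date(2020, 1, 31)),
    "sites": {"global": {}},  # per-site domains would be generated similarly
}
print(json.dumps(dataset_metadata)[:120])
```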
Goals:

- Users can PR new or updated datasets and have them automatically picked up by the datasets API (when merged to a `main` branch).
- Users get a mosaicjson endpoint to visualize their dataset / data collection.
Problem with this approach:

If people want to add new layers to the dashboard, they would still need to open a PR and have it reviewed and approved.

Alternatives: people could POST new datasets directly to the dataset API (but this would not work).
Acceptance criteria:

- `<env>-dataset-metadata.json` stored on S3 is updated whenever a PR is merged to the new `dashboard-datasets-starter` repo.
- Config files PR'd to this new repo can include a STAC API URL and query parameters to generate a mosaic. The lambda will generate the mosaic endpoint and include that endpoint in `<env>-dataset-metadata.json` (see the sketch after this list).
- For MAAP: a user can create a PR to `dashboard-datasets-maap` for data with an existing tiles endpoint and it will add a layer to the dashboard (once merged).
- For MAAP: a user can create a PR to `dashboard-datasets-maap` for an SRTM data mosaic (using a STAC API and query parameters) to add an SRTM layer to the dashboard (once merged).
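As a hedged sketch of the mosaic criterion: a config file PR'd to the repo might carry a STAC API URL plus query parameters, and the lambda could POST that search to the dataset API's `/mosaic` endpoint (whether the lambda should do this itself is an open question, see below) and write the returned URL into `<env>-dataset-metadata.json`. The config keys, API base URL, and response field are all assumptions.

```python
import requests

# Illustrative shape of a PR'd dataset config; keys are assumptions.
dataset_config = {
    "id": "srtm",
    "stac_api_url": "https://stac.example.com",  # hypothetical STAC API
    "stac_query": {"collections": ["SRTMGL1"], "bbox": [-180, -56, 180, 60]},
}

def register_mosaic(api_base: str, config: dict) -> str:
    """POST the STAC search to a /mosaic endpoint and return the mosaic/tiles URL
    to embed in <env>-dataset-metadata.json (response field is an assumption)."""
    resp = requests.post(
        f"{api_base}/mosaic",
        json={"stac_api_url": config["stac_api_url"], "query": config["stac_query"]},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["tiles_url"]

# Example usage (hypothetical API base URL):
# dataset_config["tiles"] = register_mosaic("https://api.example.com/v1", dataset_config)
```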
Proposed solution:

Tasks:

- [x] Create a new repo in NASA-IMPACT, "dashboard-datasets(-starter)", and reference / reuse code from the dashboard-api-starter dataset metadata generator lambda for generating metadata via a lambda function and storing it on S3. This code could be copy/pasted, but the acceptance criterion for this first task is just that the lambda takes the existing dataset config file(s) (e.g. for MODIS) stored in the same repo and updates the S3 metadata file. The S3 bucket location should be configurable (see the sketch below).
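A minimal sketch of that first task, assuming the lambda reads JSON config files from a `datasets/` folder in the repo and writes `<env>-dataset-metadata.json` to a bucket supplied through environment variables (the variable names and folder layout are assumptions, not the actual dashboard-api-starter settings):

```python
import json
import os
from pathlib import Path

import boto3

DATASETS_DIR = Path(os.environ.get("DATASETS_DIR", "datasets"))
METADATA_BUCKET = os.environ["METADATA_BUCKET"]   # S3 bucket location is configurable
STAGE = os.environ.get("STAGE", "dev")            # yields e.g. dev-dataset-metadata.json

def handler(event, context):
    """Read the dataset config file(s) committed to the repo and publish the
    combined metadata document to S3."""
    datasets = {}
    for config_path in sorted(DATASETS_DIR.glob("*.json")):
        config = json.loads(config_path.read_text())
        datasets[config["id"]] = config           # temporal domain / mosaic URL added here
    boto3.client("s3").put_object(
        Bucket=METADATA_BUCKET,
        Key=f"{STAGE}-dataset-metadata.json",
        Body=json.dumps({"datasets": datasets}),
        ContentType="application/json",
    )
    return {"datasets": len(datasets)}
```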
Questions:

- Is it the right approach to have the lambda call POST `/mosaic`, or will users always do that themselves and then PR a config file with the mosaic URL already defined?
- For MAAP, we might want to restrict the datasets which can be visualized to those published in CMR. How can we quality control datasets?
- Remove the `/sites` code for now and re-implement it when requested?