underlay / underlay.org

https://www.underlay.org
GNU General Public License v2.0

update: Full pass at integration and cleanup of Collection flow #69

Closed isTravis closed 2 years ago

isTravis commented 2 years ago

This will likely be a long-lived (1-2 weeks) and uncharacteristically large PR. The project is in a state where nearly all the pieces work, but there is no cohesive model or process for bringing them together. I intend to provide that in this PR, which will require several database migrations, code refactors, and UI updates.

To avoid causing conflicts on the dev branch while I make these migrations and refactors, I plan on keeping changes here until the full integration is updated and ready. I'll keep a rough task list at the bottom of this to keep folks apprised of progress, but it's just my notes and is subject to change - it's not an exhaustive or exact set of requirements/features. I'll also include descriptions of larger changes and refactors - those will likely be easier to follow than the large number of file changes.

## Schemas

Schemas are currently a single field on the Collection model. In order to keep track of the history of schema edits, they need to be their own model. I'll be creating a Schema table and connecting its items to collections.
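
As a rough illustration of why a separate model helps, here is a minimal TypeScript sketch in which every save appends a new schema row instead of overwriting a field on the collection. The field names (`content`, `version`, `createdAt`), the helper functions, and the in-memory array standing in for the table are all assumptions for illustration, not the actual migration in this PR.

```typescript
// Hypothetical shape; the real Schema table may differ.
interface Schema {
  id: number;
  collectionId: string;
  content: string; // the schema source text
  version: number; // increases by 1 per save within a collection
  createdAt: Date;
}

// In-memory stand-in for the Schema table.
const schemaTable: Schema[] = [];

// All schema edits for a collection, oldest first.
function schemaHistory(collectionId: string): Schema[] {
  return schemaTable
    .filter((s) => s.collectionId === collectionId)
    .sort((a, b) => a.version - b.version);
}

// Saving never mutates an existing row; it appends a new one.
function saveSchema(collectionId: string, content: string): Schema {
  const row: Schema = {
    id: schemaTable.length + 1,
    collectionId,
    content,
    version: schemaHistory(collectionId).length + 1,
    createdAt: new Date(),
  };
  schemaTable.push(row);
  return row;
}

// The schema a collection currently uses is simply its latest row.
function currentSchema(collectionId: string): Schema | undefined {
  const history = schemaHistory(collectionId);
  return history[history.length - 1];
}
```

With this shape, the edit history is just a query over rows that share a `collectionId`, and nothing is lost when a schema is edited.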

## Uploads vs Publishing

One misalignment I noticed is that our model equates a file upload with a new published version. I expect folks will often have several files, or a file plus manual edits, or some other combination of inputs that they want to make before publishing a new version. Essentially, there is a draft version that collects edits until the user is ready to publish. This requires us to track edits/uploads separately from versions (tracking this is also the basis for provenance).
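
To make the draft idea concrete, here is a minimal TypeScript sketch in which edits accumulate without a version and publishing assigns all pending edits to a new one. The `Edit`, `Version`, and `CollectionHistory` names and their fields are hypothetical and chosen only for illustration; this is not the implementation in this PR.

```typescript
// Hypothetical shapes for illustration only.
interface Edit {
  id: number;
  description: string; // e.g. "CSV upload" or "manual edit"
  versionId: number | null; // null while the edit is still part of the draft
}

interface Version {
  id: number;
  editIds: number[]; // the edits that were published in this version
  publishedAt: Date;
}

class CollectionHistory {
  private edits: Edit[] = [];
  private versions: Version[] = [];

  // Uploads and manual edits are recorded without creating a version.
  addEdit(description: string): Edit {
    const edit: Edit = { id: this.edits.length + 1, description, versionId: null };
    this.edits.push(edit);
    return edit;
  }

  // The draft is every edit that has not been published yet.
  draft(): Edit[] {
    return this.edits.filter((e) => e.versionId === null);
  }

  // Publishing turns the whole draft into one new version.
  publish(): Version {
    const pending = this.draft();
    if (pending.length === 0) {
      throw new Error("Nothing to publish");
    }
    const version: Version = {
      id: this.versions.length + 1,
      editIds: pending.map((e) => e.id),
      publishedAt: new Date(),
    };
    for (const edit of pending) {
      edit.versionId = version.id;
    }
    this.versions.push(version);
    return version;
  }
}
```

Because each edit records which version it ended up in, the same records can later answer provenance questions such as "which upload introduced this change?".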

In modeling this, I landed on having two types of thing:

  1. Inputs, which represent a specific contribution to the dataset, by a single person, at a single time, from a single source. We can have a single Items table that holds all of these and is connected to a collectionId.
  2. Input Sources, which represent a specific form factor of input (e.g. CSV upload, JSON upload, Web UI edit, API request payload). Each Input has a foreign key to a single InputSource. We can have many types of Input Sources (each its own table), which normalize the data about that type of input source. For example:
    
    InputSourceCSV
      uploadedFileUri
      mapping
      createdAt

    InputSourceAPI
      sourceIpAddress
      payload
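
The Input / Input Source split above could be sketched in TypeScript roughly as follows. Only `uploadedFileUri`, `mapping`, `createdAt`, `sourceIpAddress`, and `payload` come from the description; the `kind` tag, the ids, `createdBy`, the shape of `mapping`, and the helper functions are assumptions added for illustration.

```typescript
// One interface per input-source table. The `kind` tag is an assumed
// discriminator so an Input can say which table its source lives in.
interface InputSourceCSV {
  kind: "csv";
  id: number;
  uploadedFileUri: string;
  mapping: Record<string, string>; // assumed shape: CSV column -> schema property
  createdAt: Date;
}

interface InputSourceAPI {
  kind: "api";
  id: number;
  sourceIpAddress: string;
  payload: unknown;
}

type InputSource = InputSourceCSV | InputSourceAPI;

// A single contribution: one person, one time, one source.
interface Input {
  id: number;
  collectionId: string;
  createdBy: string; // hypothetical user identifier
  createdAt: Date;
  sourceKind: InputSource["kind"]; // which source table
  sourceId: number; // foreign key into that table
}

// Follow the foreign key from an Input to its source record.
function resolveSource(input: Input, sources: InputSource[]): InputSource | undefined {
  return sources.find((s) => s.kind === input.sourceKind && s.id === input.sourceId);
}

// Narrowing on `kind` gives typed access to the source-specific fields.
function describeSource(source: InputSource): string {
  if (source.kind === "csv") {
    return `CSV upload from ${source.uploadedFileUri}`;
  }
  return `API request from ${source.sourceIpAddress}`;
}
```

Keeping the common fields on `Input` and the format-specific fields on each source type means a new form factor (say, a JSON upload) only needs a new source type, not a change to `Input` itself.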


It was helpful for me to model the flow for a CSV upload as guidance:
<img width="1684" alt="Underlay CSV Flow" src="https://user-images.githubusercontent.com/1000455/166744635-c2f6c063-32de-4551-b396-be477898f32b.png">

## Running task list
As noted above, this is mostly for my notes, not an exhaustive or exact spec-list/roadmap. I'll update it each day based on the progress I've made.

- [x] Clean the Schema edit flow
  - [x] Create `Schema` objects on save
  - [x] Fix relationship bug
  - [x] Default to static viewer with Edit button if there is already a schema
  - [x] Build SchemaViewer for static version
  - [x] Block editing if there is data - we need to integrate tasl migrations for this to work properly.
  - [x] Clean up components
  - [x] Update CSS
- [x] Refactor upload popover
  - [x] Add nested design to schema-alignment component
  - [x] Have 'complete' button generate `Input` and `InputSource` objects. Create the backend space for doing future 
  - [x] Process input into stored data file used for rendering.
  - [x] Add reductionType options
- [x] Load data from generated file (e.g. processed draft or version file)
- [x] Allow version switches with dropdown
- [ ] Build structure for making Data queries/selections. It'll all be client-side for now, but eventually this will be where API calls go.
- [ ] Visualize Inputs
- [ ] Add button on entities to display provenance viewer (just shows related `Inputs`).
- [x] Build Publish button and flow
- [ ] Build JSON export flow
  - [x] Build alignment/selection tool
  - [x] Generate cached file and create `Export` object with proper links, etc
  - [x] Figure out how to get exports to auto-generate on version update
  - [ ] Update Export Table to show real values, etc
  - [x] Update CSS
- [ ] Update Overview page to properly hook into versions, schemas, etc
  - [ ] Update getting started tab, remove schema flat image
- [x] Make collection slugs have a permanent suffix. This will be useful for routing export caches and in case collection/namespace titles change
- [ ] Add discussions
  - [ ] UI for creating
  - [ ] Visualize on entity
- [ ] Improve collection preview design
- [x] Make `unique` field singular and rename it to `uniqueIdentifier`
- [ ] Implement settings pages content
- [ ] Improve collection header design
- [ ] Update landing page with improved language
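
For the permanent-suffix slug item in the task list above, one possible approach is sketched below in TypeScript. The slug format (`title-suffix`), the six-character random suffix, and every function name here are assumptions for illustration; the PR does not specify how the suffix is generated.

```typescript
// Lowercase the title, collapse runs of non-alphanumerics into "-",
// and trim leading/trailing dashes.
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Hypothetical suffix generator: random lowercase letters and digits.
function randomSuffix(length: number = 6): string {
  const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
  let out = "";
  for (let i = 0; i < length; i++) {
    out += alphabet.charAt(Math.floor(Math.random() * alphabet.length));
  }
  return out;
}

// Build a slug whose last segment is the permanent suffix.
function makeSlug(title: string, suffix: string = randomSuffix()): string {
  return `${slugify(title)}-${suffix}`;
}

// The suffix is everything after the final "-" (the suffix itself has no dashes).
function suffixOf(slug: string): string {
  return slug.slice(slug.lastIndexOf("-") + 1);
}

// Renaming changes the readable part but keeps the suffix, so anything
// keyed on the suffix (e.g. an export cache path) still resolves.
function renameSlug(existingSlug: string, newTitle: string): string {
  return makeSlug(newTitle, suffixOf(existingSlug));
}
```

Routing on `suffixOf(slug)` rather than the whole slug is what lets titles change without breaking existing links.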

vercel[bot] commented 2 years ago

The latest updates on your projects.

| Name | Status | Updated |
| --- | --- | --- |
| underlay-org | ❌ Failed | Jun 13, 2022 at 3:32PM (UTC) |