Added the proposal for the new data model.

weegeekps commented 1 year ago

As discussed earlier today, here is the proposal for the new data model, brought over from the gist.

weegeekps commented 1 year ago

This was originally posted by @javagl

I started drafting that in https://github.com/javagl/glTF-Project-Explorer/tree/data-model-v2

There is no PR for that yet, but that can be opened at any time, together with

The information from this gist (if this is OK for you)

The following summary of changes until now:

The changes are

The IProjectInfo has been cleaned up: The legacy fields have been removed, and it now contains a tags record

(There is an ILegacyProjectInfo that is read from the file, but immediately translated to the new structure in the DataService.ts)

There is an IProjectsMetadata.ts that summarizes the metadata. It just maps the fixed properties and tags to IValueType objects that contain the type/isArray from the above draft.

The IProjectsMetadata.ts still contains the ProjectTags/ProjectFilterTags that have been part of the first generalization pass. These should not be "static" exports, but part of an IProjectsMetadata instance (see notes below)

The DataService offers a method to read an IProjectsMetadata from a file

The IProjectsMetadata is not yet really used. It will have to contain more information, and will have to be examined in order to build the UI. That instance will have to be passed around a little, and I have not yet thought much about where this instance should reside. Probably, that instance should directly be in the IAppState, right?
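
To make the translation step concrete, here is a rough sketch of how a legacy entry could be folded into the new structure with a tags record. The legacy field names, `LEGACY_TAG_FIELDS` and the `fromLegacy` helper are assumptions made for illustration; the actual interfaces in the data-model-v2 branch may look different.

```typescript
// Hypothetical shapes: the field names are assumptions for illustration,
// not copied from the data-model-v2 branch.
interface ILegacyProjectInfo {
  name: string;
  description: string;
  task?: string[];
  type?: string[];
  license?: string[];
  language?: string[];
  inputs?: string[];
  outputs?: string[];
}

interface IProjectInfo {
  name: string;
  description: string;
  // All categorized values live in one record, keyed by tag name.
  tags: Record<string, string[]>;
}

// The legacy top-level fields that are assumed to become tags.
const LEGACY_TAG_FIELDS = ["task", "type", "license", "language", "inputs", "outputs"] as const;

// Translate a legacy entry into the new structure (hypothetical helper).
function fromLegacy(legacy: ILegacyProjectInfo): IProjectInfo {
  const tags: Record<string, string[]> = {};
  for (const field of LEGACY_TAG_FIELDS) {
    const values = legacy[field];
    // Missing or empty legacy fields produce no tag entry at all.
    if (values !== undefined && values.length > 0) {
      tags[field] = values.slice();
    }
  }
  return { name: legacy.name, description: legacy.description, tags };
}
```

Whether empty legacy fields should be dropped or kept as empty arrays is a design choice; the sketch drops them so that the tags record only lists what a project actually declares.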

One piece of information that may have to be added to the metadata is something like a "UI name" of the tags, as in
      "inputs": {
        "type": "string",
        "isArray": true,
        "description": "The supported input file formats"
      },
Other things that are still missing refer to this:

How do we define which fields can be used for searching, and which for filtering? ... ... and filtering on the "tags"

The filtering currently does not happen on all tags - for example, there are no filter options for the inputs/outputs. This is exactly the difference between the ProjectTags and ProjectFilterTags (the latter do not include inputs and outputs).

We could throw in a straightforward solution, and say
      "task": {
        "type": "string",
        "isArray": true,
        "isFilterTag": true
      },
      ...
      "inputs": {
        "type": "string",
        "isArray": true,
        "isFilterTag": false
      },
but that's not enough: We have to define the "tags that can be filtered by" in a form that defines the order of the filter selection boxes in the UI. (We could rely on the order in the metadata JSON, but... yeah, well, let's see...)

For the full text search:

A flag like
      "inputs": {
        "type": "string",
        "isArray": true,
        "isIndexedForFullTextSearch": true
      },
could be easier. But... we should probably think about whether there will ever be a case where we want to build two indices - like two text fields "Search in names" and "Search in descriptions" or so. That may sound far-fetched, but ... If we want to offer an option to search, for example, in the inputs, then someone might want to enter STEP as the input file format, and not receive a list of projects where the description says that the project is a "Huge step forward"...

One could brainstorm here, like
{
  thereShouldBeIndicesFor: [
    ["name", "description"],
    ["inputs", "outputs"],   
  ],
}
but that can be sorted out after the other questions have been tackled.
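
The STEP example can be illustrated with a small sketch that builds one index per group of fields. This is a plain in-memory inverted index written for illustration, not the flexsearch API; `IndexedProject`, `buildIndex` and `lookup` are hypothetical names.

```typescript
// A project reduced to the fields relevant for this sketch (hypothetical shape).
interface IndexedProject {
  name: string;
  description: string;
  inputs: string[];
  outputs: string[];
}

type FieldName = keyof IndexedProject;

// Split text into lowercase alphanumeric tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((token) => token.length > 0);
}

// Build one inverted index (token -> project positions) over a group of fields.
function buildIndex(projects: IndexedProject[], fields: FieldName[]): Map<string, Set<number>> {
  const index = new Map<string, Set<number>>();
  projects.forEach((project, position) => {
    for (const field of fields) {
      const value = project[field];
      const texts = Array.isArray(value) ? value : [value];
      for (const text of texts) {
        for (const token of tokenize(text)) {
          let hits = index.get(token);
          if (hits === undefined) {
            hits = new Set<number>();
            index.set(token, hits);
          }
          hits.add(position);
        }
      }
    }
  });
  return index;
}

// Look up a single word and return the matching project positions in ascending order.
function lookup(index: Map<string, Set<number>>, word: string): number[] {
  const hits = index.get(word.toLowerCase());
  return hits === undefined ? [] : Array.from(hits).sort((a, b) => a - b);
}

const sampleProjects: IndexedProject[] = [
  { name: "Converter A", description: "Converts CAD files", inputs: ["STEP"], outputs: ["glTF"] },
  { name: "Viewer B", description: "A huge step forward for viewers", inputs: ["glTF"], outputs: [] },
];

// Two separate indices, one per field group from the brainstormed config above.
const textIndex = buildIndex(sampleProjects, ["name", "description"]);
const formatIndex = buildIndex(sampleProjects, ["inputs", "outputs"]);
```

With these two indices, `lookup(formatIndex, "STEP")` only finds the converter, while `lookup(textIndex, "step")` only finds the viewer whose description mentions a "step forward".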

weegeekps commented 1 year ago

So, I'm actually seriously considering we potentially get rid of the separate filter fields and move to a "super field" that can accept full text searching plus some filtering potential based on any tag with a bit of autocomplete.

Effectively, you could have a query like:

dan's great gltf creator [outputs:glTF 2.0]

Pure text would be treated as a full-text search, and the tags would be div elements, represented by the brackets. By typing glTF you'd get an autocomplete box that shows the potential options for tag filters you can apply. You can either keep typing, or use the arrow keys/mouse to select the tag you want to filter on.
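
As a rough sketch of how such a query could be split into its two parts, assuming the bracket notation from the example above is also the textual form of the tags (the `ParsedQuery` shape and `parseSuperQuery` are hypothetical names, and the UI itself may never need to parse text like this):

```typescript
// Result of splitting a "super field" query (hypothetical shape).
interface ParsedQuery {
  text: string; // free text for the full-text search
  tagFilters: { tag: string; value: string }[];
}

function parseSuperQuery(query: string): ParsedQuery {
  const tagFilters: { tag: string; value: string }[] = [];
  // Match "[tag:value]" groups; the value may contain spaces but no brackets.
  const pattern = /\[([^:\[\]]+):([^\[\]]*)\]/g;
  const text = query
    .replace(pattern, (_match: string, tag: string, value: string) => {
      tagFilters.push({ tag: tag.trim(), value: value.trim() });
      return " ";
    })
    // Collapse the whitespace left behind by the removed groups.
    .replace(/\s+/g, " ")
    .trim();
  return { text, tagFilters };
}

const parsed = parseSuperQuery("dan's great gltf creator [outputs:glTF 2.0]");
// parsed.text is "dan's great gltf creator"
// parsed.tagFilters is [{ tag: "outputs", value: "glTF 2.0" }]
```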

I can mock something up while I'm on holiday. Stay tuned for more.

javagl commented 1 year ago

> So, I'm actually seriously considering we potentially get rid of the separate filter fields and move to a "super field" that can accept full text searching plus some filtering potential based on any tag with a bit of autocomplete.

I thought about that as well - similar to searches like in the GitHub PRs, which by default are is:pr is:open, but allow further constraints like author:javagl. But we'd have to think about how to make this accessible. We don't want to force new users to type in things like converter [inputs:STEP, ...how to list?] [outputs:glTF 2.0] [type:application] into a text field.

Some high-level brainstorming thoughts:

There could still be the default full-text search only for title and description
There could be some sort of "advanced" search where the user can type in these [outputs:. ] queries
That approach implies that we define some sort of (small) "query language", eventually
The clickable filter fields in the UI could remain as they are, but "under the hood", they would cause the proper query string to be constructed
We have to think about AND vs. OR, and things like this make defining such a query (cleanly, formally) more difficult...
Bonus: If it is possible to attach this as an actual query string to the URL, it would be possible to share URLs with "pre-configured searches" like [All C++ projects](http://example.com?query=[language:C++]).
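
A minimal sketch of the "under the hood" idea and the shareable URL, assuming the bracket syntax from above and assuming that several values for one tag are written as repeated brackets (that part is still an open question). `SelectedFilters`, `toQueryString`, `toShareableUrl` and `queryFromUrl` are hypothetical names.

```typescript
// Selected values per tag, as the clickable filter boxes might hold them (hypothetical shape).
type SelectedFilters = Record<string, string[]>;

// Build the query string that the filter boxes would construct under the hood.
function toQueryString(text: string, filters: SelectedFilters): string {
  const parts: string[] = [];
  if (text.trim().length > 0) {
    parts.push(text.trim());
  }
  for (const tag of Object.keys(filters)) {
    for (const value of filters[tag] ?? []) {
      // Assumption: one bracket group per selected value.
      parts.push(`[${tag}:${value}]`);
    }
  }
  return parts.join(" ");
}

// Attach the query to a URL; encodeURIComponent escapes "[", "]", ":", "+" and spaces.
function toShareableUrl(baseUrl: string, query: string): string {
  return `${baseUrl}?query=${encodeURIComponent(query)}`;
}

// Read the query back from a shared URL.
function queryFromUrl(url: string): string {
  const match = /[?&]query=([^&#]*)/.exec(url);
  return match === null ? "" : decodeURIComponent(match[1] ?? "");
}

const cppQuery = toQueryString("", { language: ["C++"] });
const cppUrl = toShareableUrl("http://example.com", cppQuery);
// cppQuery is "[language:C++]"
// cppUrl is "http://example.com?query=%5Blanguage%3AC%2B%2B%5D"
```

Escaping matters here because an unencoded "+" in a query string is commonly decoded as a space, which would turn C++ into "C" followed by two spaces.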

(There's still that task on my plate to wrap up that local experiment, where I tried to filter the lists of available filter tags, so that it's not possible to click a filter that causes the list to become empty. From a UX perspective, that's awkward, to say the least. But maybe we should try to focus on the POC, keeping future features in mind, but tracking the ideas around that in other issues)

javagl commented 1 year ago

Not really substantial progress here.

However, one aspect from my first (quoted) comment here was

> We have to define the "tags that can be filtered by" in a form that defines the order of the filter selection boxes in the UI.

and in the latest pass, I added this in the metadata JSON as

"filterTags": ["tags", "task", "type", "language", "license"]

This is probably preliminary. On the one hand, the question of whether something can be used for filtering is part of the data model (we cannot use these filters for free-text fields ... and they wouldn't make much sense for a date either). On the other hand, the order of the filter UI components is solely the responsibility of the UI creator.

So I think that these things could reasonably be part of a "config file" that is independent of the metadata, and that may carry ~"all the stuff that is relevant only for the UI". This file might later also include stuff like color schemes or whatnot.
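
Wherever the list ends up, the two concerns can be kept apart in code. The following sketch assumes that the metadata has a `tags` record of IValueType objects next to the `filterTags` array; that exact shape and the `resolveFilterTags` helper are assumptions, not the current state of the branch.

```typescript
interface IValueType {
  type: string;
  isArray: boolean;
}

// Assumed shape of the metadata for this sketch.
interface IProjectsMetadata {
  tags: Record<string, IValueType>;
  filterTags: string[];
}

// Return the filter tags in UI order, dropping names that are not declared as tags
// (for example free-text properties such as a description).
function resolveFilterTags(metadata: IProjectsMetadata): string[] {
  return metadata.filterTags.filter((name) => metadata.tags[name] !== undefined);
}

const sampleMetadata: IProjectsMetadata = {
  tags: {
    tags: { type: "string", isArray: true },
    task: { type: "string", isArray: true },
    type: { type: "string", isArray: true },
    language: { type: "string", isArray: true },
    license: { type: "string", isArray: true },
    inputs: { type: "string", isArray: true },
  },
  // "description" is deliberately wrong here to show that it gets dropped.
  filterTags: ["tags", "task", "type", "language", "license", "description"],
};

const orderedFilterTags = resolveFilterTags(sampleMetadata);
// orderedFilterTags is ["tags", "task", "type", "language", "license"]
```

The order comes purely from the `filterTags` array (the UI concern), while the check against `tags` is the data-model concern.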

weegeekps commented 1 year ago

These answers are all over the place. The weekend hasn't gone as I had hoped, so I didn't make as much progress today as I wanted. Going to spend a bit more time tomorrow evening to wrap up the mockups of the "super-search". I'll also get #168 reviewed fully in the morning.

I think those five tags are good for the first pass.

> So I think that these things could reasonably be part of a "config file" that is independent of the metadata, and that may carry ~"all the stuff that is relevant only for the UI".

I'd honestly like to avoid this if possible, and that's part of the reason for having the metadata block. I think it's reasonable for the metadata block to provide the relevant information for the UI, including order, descriptive text, and whatever else we need. It doesn't need to support all of this up-front. I realize that is flying awfully close to almost a CMS amount of complexity, but I think that's one of the things that CMS solutions tend to do right more often than not. I see this approach as being particularly advantageous because it would allow us to have a single code-base.

> the question of whether something can be used for filtering is part of the data model

This is a big question, and one I do think we need to answer. While isFilterTag allows admins to configure which tags they want to use filtering for, I think you're right that we need to put some more guard rails up depending on data types. I want to take some more time to think about this, but a rough idea is:

| type | permissible filtering |
| --- | --- |
| string | flexsearch |
| url | flexsearch |
| number | equality |
| date | equality |
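
As a sketch, the rough idea above could be expressed as a lookup that the UI consults before it offers a filter for a field. The names `FilterKind`, `PERMISSIBLE_FILTERING` and `filterKindFor` are hypothetical, and the mapping just mirrors the rough idea, which may still change.

```typescript
type FilterKind = "flexsearch" | "equality";

// Guard rails per data type, mirroring the rough idea above.
const PERMISSIBLE_FILTERING: Record<string, FilterKind> = {
  string: "flexsearch",
  url: "flexsearch",
  number: "equality",
  date: "equality",
};

// Return the permissible kind of filtering for a type, or undefined if none is defined.
function filterKindFor(type: string): FilterKind | undefined {
  // hasOwnProperty avoids accidental hits on inherited keys such as "constructor".
  return Object.prototype.hasOwnProperty.call(PERMISSIBLE_FILTERING, type)
    ? PERMISSIBLE_FILTERING[type]
    : undefined;
}
```

A field whose type yields `undefined` would then simply not get a filter, regardless of what `isFilterTag` says.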

> We have to think about AND vs. OR, and things like this make defining such a query (cleanly, formally) more difficult...

This is really easily overthought, and I'd suggest we just do what we're currently doing. Tag categories are AND'ed together, and searches within those tags are OR'ed together. So if I search the inputs tag for gltf and fbx, and language is C++, the boolean comparison would be:

(inputs=gltf or inputs=fbx) and language=c++

The user would get results that have either gltf or fbx input support, written in C++.
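
A minimal sketch of that rule, assuming projects carry a tags record as in the new data model and that values are compared case-insensitively (`TaggedProject`, `TagSelection` and `matchesSelection` are hypothetical names, not the current implementation):

```typescript
interface TaggedProject {
  name: string;
  tags: Record<string, string[]>;
}

// Selected values per tag category.
type TagSelection = Record<string, string[]>;

function matchesSelection(project: TaggedProject, selection: TagSelection): boolean {
  // Every category must match (AND) ...
  return Object.keys(selection).every((tag) => {
    const wanted = (selection[tag] ?? []).map((value) => value.toLowerCase());
    if (wanted.length === 0) {
      return true; // a category without selected values does not restrict anything
    }
    const actual = (project.tags[tag] ?? []).map((value) => value.toLowerCase());
    // ... and within a category, any selected value suffices (OR).
    return wanted.some((value) => actual.indexOf(value) !== -1);
  });
}

const candidates: TaggedProject[] = [
  { name: "A", tags: { inputs: ["glTF"], language: ["C++"] } },
  { name: "B", tags: { inputs: ["FBX"], language: ["Python"] } },
  { name: "C", tags: { inputs: ["OBJ"], language: ["C++"] } },
];

// (inputs=gltf or inputs=fbx) and language=c++
const matchingNames = candidates
  .filter((project) => matchesSelection(project, { inputs: ["gltf", "fbx"], language: ["c++"] }))
  .map((project) => project.name);
// matchingNames is ["A"]
```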

weegeekps commented 1 year ago

Also, another idea for sorting out the search and filtering query language is to see what we can do with something like Elasticlunr, or even flexsearch which we are using now. Abstracting something out via the UI shouldn't be too difficult.

KhronosGroup / glTF-Project-Explorer

Added the proposal for the new data model. #167