Implement new UI changes to support new Lucene

louise-davies commented 2 years ago

Description: Follows on from #1361 - certain UI changes we like what Patrick has done and so we want to port them over exactly - the sort and my data checkboxes in the search controls, removing of the filters from the tables (and removing some padding). These need to either be sorted by #1361 or re-implemented.

Acceptance criteria:

[ ] Patrick's UI changes that are fine as is (i.e. search box controls, removal of table header padding) are implemented

patrick-austin commented 2 years ago

I've rewritten the help text to reflect new features:

Synonyms (and stemming)
Ability to target fields (this was explicitly discouraged in previous material, I suspect because previously we were concatenating all ICAT fields into one Lucene field called text so it had limited value)

and clarify some finer points that came up in the user meetings:

How common filepath characters (. - / \) behave
How wildcards interact with phrase quoting

Ultimately we might decide that this needs less (or more) detail, or a general re-wording. But hopefully it contains all the underlying points that we want to make. We also discussed having video material, so we might want to make this consistent with that.

Finally, I've summarised the points that can either lead to timeout errors or syntax errors in the search engine (with the intent of displaying this to the user in the event of either error). I wasn't sure how errors are handled so I'll just paste it here (again happy for it to be reworded if needed):

Unable to complete requested search in under {} seconds. To ensure searches complete quickly, please try:
- Only searching "my data"
- Only searching the type of entity you need results for
- Using less wildcard characters in the search term(s)
- Making the search term(s) more specific
- Using the default relevancy based sorting

Please refer to the full help for search syntax, or try:
- Replacing \ characters with spaces (unless being used to escape another special character)
- Surrounding text containing other special characters with double quotes

louise-davies commented 2 years ago

New requirement: Do not display My Data checkbox if the user is logged in as anon

louise-davies commented 2 years ago

New requirement: have a tooltip on my data checkbox to explain it

louise-davies commented 2 years ago

New requirement: my data checkbox is only actioned when the user presses search

louise-davies commented 2 years ago

@mattjSTFC have you had a chance to take a look at the filter panel UI for this? As we think that's the only work outstanding on this issue other than getting the PR reviewed :)

mattjSTFC commented 2 years ago

@louise-davies Sorry, no, but I'll make this a priority.

What is the state of play re filters from a coding point of view? As in was the agreed plan to get them working even though they may offer little value at the moment?

I think what I am trying to ascertain is how much time to spend on it? 😆 Do we need to have it precisely laid out?

louise-davies commented 2 years ago

@agbeltran suggested that she preferred that we don't ship with it "incomplete" and instead try and get all the functionality out now and working/looking as intended - as this may encourage the facilities to improve their metadata if we have support for everything built in.

mattjSTFC commented 2 years ago

@louise-davies Do we have an idea of what filters will be shown? Plus any options available within each filter.

I have screenshots from Patrick's demo to help but this was based on enhanced metadata. Any clues on what can be realistically shown, with the current metadata, would help aid me setting up the wires. 👍

louise-davies commented 2 years ago

@patrick-austin can you help with this? I suppose this isn't super simple as we can't just point at the prod/preprod data as they're not running the correct lucene...

patrick-austin commented 2 years ago

From my perspective there are three aspects to this: what has backend support, what actually exists in the metadata now, and how we choose to represent that.

InvestigationType.name, DatasetType.name, DatafileFormat.name, DatasetTechnique.technique.name (new in ICAT 5), and the various parameter names and values (SampleParameter.type.name, InvestigationParameter.stringValue, DatafileParameter.numericValue, DatasetParameter.dateTimeValue etc.) are all facetable, and could be returned if set.

In terms of what actually exists, I have more knowledge of the DLS metadata than ISIS. But I wouldn't expect any techniques (since they're new). For the type and format names, there are unlikely to be more than 10 possible values, and most of the data has the same value ("experiment" or "experiment_raw" seems to dominate). The parameters will depend more on the facility. DLS has around 10 possible InvestigationParameters which have simple string values (one is "compulsorily_remote", "COVID 19" is one, stuff like that). ISIS I think has two InvestigationParameters as well, but they mainly use DatafileParameter, with a mixture of stringValue and numericValue. One Datafile can have a lot of these, at least 10s maybe over a 100. The names and meanings are more technical than the DLS ones which are just tags. When I mocked the data I chose a few real names to give fake values to, but expect things like:

        parameters:
        - type: ParameterType_name-end=5Fdate
          stringValue: 2015-07-21 12:02:54
        - type: ParameterType_name-time=5Fchannel=5Fparameters
          stringValue: 10 4800000 2000
        - type: ParameterType_name-run=5Ftitle
          stringValue: horizontal slit scan8
        - type: ParameterType_name-number=5Fof=5Fdetectors
          numericValue: 790
        - type: ParameterType_name-good=5Fframes
          numericValue: 27950
        - type: ParameterType_name-run=5Fduration
          numericValue: 950
        - type: ParameterType_name-primary=5Fflight=5Fpath
          numericValue: 0.0
        - type: ParameterType_name-good=5Fproton=5Fcharge
          numericValue: 9.5

It's worth noting that I've never seen anyone using the dateTimeValue (indeed ISIS store date/time data as a stringValue instead), and that ESRF use DatasetParameters.

we can't just point at the prod/preprod data as they're not running the correct lucene...

While that's true, we set @kennethnym up to point DG at the VM I was using to test performance on. That has real Diamond data indexed into icat.lucene from a snapshot of their DB. So you could use that too, to experiment with as needed.

Finally there's the question of how we display this. Ultimately DG can do what it likes obviously, the data it recieves for types and formats (and techniques) is a simple list of strings, with the number of times each appears in the search data (this might be hardcoded to give the top 10, but that number could be altered if needed). The implementation I demoed is to have this as a list, and clicking on one filters to that value. However a drop down or even an open text box you type in and then it shows the relevant possible values based on the input might be more suitable. For parameters it's slightly more complex, you perform the same process with the parameter name (i.e. this Datafile must have a recorded "duration"). But to filter based on the value, you need to specify the name and some control on the value. For strings, I demoed this as a list you select from just like the types. I also did this with date values, by simply binning the values into "2022", "2021", "2020" "2019 or earlier". However it's very hard to know if this is suitable or not as no-one uses these. I could imagine this approach working for say, the date you got funding, or the date of publication, but if its the timestamp of when the first neutron pulse was emitted than knowing it occured sometime before 2019 is useless. Under the hood the date is just the number of ms since epoch, so we can treat this as a number if we want. Which brings us to numericValues. The demo had a box for min, box for max, and optional units. This maybe isn't as slick as a slider, or binning the values into (for example) 0-10, 10-20, 20-30, 30+, but it's very hard to know what sensible bins would be (even more so than the dates).

At the end of the day my demo implementation was a trade off between what looked nice and what I thought I could implement before actually needing to demo to users. If as UI and frontend experts, you decide a different approach is warranted, then by all means go with that.

Hopefully that's enough detail, but if any of that doesn't make sense let me know and we can discuss further.

louise-davies commented 2 years ago

@kennethnym could you post some screenshots/GIFs of what the current filter panel UI looks like so that Matt has something to visualise?

kennethnym commented 2 years ago

Will do

kennethnym commented 2 years ago

https://user-images.githubusercontent.com/19359984/191964122-c3b140ad-9a6a-4513-b5b1-2a3bba3c6b5c.mp4

louise-davies commented 2 years ago

@kennethnym could you get a video at the Investigation/Dataset level, as I think there should be actual facets/parameters on there right?

kennethnym commented 2 years ago

I don't think they both have a filter panel, unless I've accidentally removed it at some point and I forgot about it :sweat_smile:

patrick-austin commented 2 years ago

Filter panel was originally in CardView. I artificially bolted it onto the side of the Datafile table for the purposes of the demo (ISIS use DatafileParameters, and there is no CardView for Datafiles). I didn't alter DatasetTable or InvestigationTable since there wasn't a need to. So unless it has been removed, should be visible for investigations in CardView

mattjSTFC commented 2 years ago

If we keep the filter section on the left of the data table within the tab it could be styled as follows:

No filters set

Multiple filters sets

To limit the amount of screen real estate lost by adding the filter panel, left of the table, we need to ensure that the least amount of space is used. Obviously without it looking to dense and unimportant
I would also suggest for Type, not having the icons, and converting to text so it is easier to understand the options.
I have also added a close icon to remove the filter
The content of the tab would need some padding added to allow the filter panel and data table to breath
Filter panel would need some right hand margin that matches the padding
As the table is 100% width I have added a border to contain it (I haven't added the table itself as would take some time to do - you'll have to visualise it for now).

Shout if I have missed anything, things that won't work or ideas to improve.

patrick-austin commented 2 years ago

I like the look of the selectors and X for removing all being on a line, and I definitely think replacing the icons with text for the type selector is the right thing to do (to be honest, I can't remember now why I didn't just use text in the first place).

There are some things I'd want to clarify as they don't appear in the mock ups (this might just be because we're happy with them as they are, and this comment will be overkill, but I'd rather that than they slip throught the cracks). For reference, here's screenshots from one of the demos I did as it has more of the facetable fields populated than @kennethnym 's video:

Where would we put the other facetable fields (e.g. Type, Parameter Name, Sample Name in the first image)? If the button to add a Parameter Value filter exists on the same level as the main "Filter" heading, would we then put these below a section for the Parameter Value filters?
Are we happy with the current method of providing a range for numeric values (first image)? Unless this has been changed since the demo, it would be as above with inputs for min/max/units.
Are we happy with the current method of selecting string values (second image)? This would be the case for string parameter values (like run_title), and also fields like Type, Sample Name and Parameter Name.
Are we happy with the current method of selecting date values (not pictured)? As I mentioned in my last comment, we currently do this by binning the dates based on year but another approach might be better. Visually this would look like the string selection in the second image.
Will the cross in the panel for removing a parameter value filter replace the chips currently being placed above the table/card view once a filter is applied (second image), or would both be present? Is there a different strategy for removing the other filters (like "Parameter Name - run_title)?
What happens if the panel gets too long (second image)? For my demo getting the panel scrollable independently of the table was a bit of a pain, so I'm sure there's a cleverer way of doin this. Note this is trivial in card view as we can scroll the whole page at once and let the panel be as long as it needs to.
How are we going to limit the panel's width? In principle, the string value of a parameter can be 4000 characters. Are we setting a fixed width to the panel, or will it dynamically resize itself (between some min and max limit). How will it behave at small sizes (I had it take full width below a certain screen size, but this then breaks the scrolling)?
What happens when there aren't any parameters? In @kennethnym 's video we see empty drop downs because there are no parameters to facet. We could remove the button to add the value filters in this case, but maybe a user wouldn't understand why the button just disappears sometimes? For context, if we do not have any values for fields like in image one, that entire expandable filter would simply be missing from the panel.

mattjSTFC commented 2 years ago

@patrick-austin Great feedback, thank you.

What would help me greatly in order to answer some of the questions, from a UX/UI design point of view, and to suggest any further improvements is to have access to a staging / development site with the filters working using the current metadata. Having access to videos is great but it isn't providing the full picture of all scenarios.

@kennethnym Do you know if this is possible?

patrick-austin commented 2 years ago

Before @kennethnym uploaded the video @louise-davies was asking if we could basically do that, deplot DG on the cloud VM that's currently running the new icat.lucene and icat.server components on the Diamond metadata. This is the same VM that Kenneth has been developing against (as far as I know).

The downside there is that's only got Diamond metadata, I don't have access to real ISIS metadata anywhere (they don't have as much data as Diamond, so wasn't interested in testing the performance on theirs). And ISIS have stuff that Diamond don't, crucially (Datafile)Parameters with numericValues set.

mattjSTFC commented 2 years ago

Thanks @patrick-austin. I'll wait for news on a site that can be accessed before making any further suggestions 👍

ral-facilities / datagateway

Implement new UI changes to support new Lucene #1363