opensearch-project / OpenSearch-Dashboards

📊 Open source visualization dashboards for OpenSearch.
https://opensearch.org/docs/latest/dashboards/index/
Apache License 2.0

[Feature Request] Field list does not reflect all the available fields for the selected index. #4439

Open tibz7 opened 1 year ago

tibz7 commented 1 year ago

Describe the bug In Discover, not all the fields in the index pattern are listed. For instance, one of my fields (log_level) is in the dataset (and mapped in the index pattern), but it does not appear in the list of available fields. However, if I filter using this field, it works.

These issues are a bit different but somewhat similar; it feels like a general UX problem about displaying information. I am writing them here; if you think that I should open a separate issue for these, I will.

To Reproduce Steps to reproduce the behavior:

  1. Go to Discover.
  2. Click on your index pattern.
  3. Search for one of your fields.
  4. Some fields are present in the available fields, others are not.
  5. Filter with a "non-present" field ... it works!

Expected behavior I would expect to see all the fields present in that index pattern

OpenSearch Version 2.7

Dashboards Version 2.7

Additional context The index pattern where I noticed the bug is an index pattern grouping other index patterns with different fields. In one index pattern the fields start with php_, in the other some start with nginx_. I don't know if that's a coincidence or not, but I assume it is worth mentioning. Each index doesn't have many fields, maybe 8 each. I don't have an index with, say, 16 fields to see if that's a display problem due to the number of fields.

ashwin-pc commented 1 year ago

Hi @tibz7, thanks for opening the issue. I have a feeling your main issue and the second issue are related to the same quirk: Discover only loads data based on the first 500 documents retrieved. So for the main issue here, the reason you aren't seeing the fields is that the field you care about isn't present in the top 500 documents. As for the summary view, the same applies: the top 500 documents do not contain any occurrence other than warn. That being said, the second issue is covered in #1995. But I'll keep this issue open to get suggestions on what to do about the primary issue.

As for the 3rd issue mentioned, can you open a different issue for it? It's unrelated to the other two but a worthwhile callout. I know that OSD isn't consistent about showing the same data sources. If you can open an issue highlighting the differences that you have seen, it will help some of the upcoming decisions around a consistent datasource selector.

Renaming the issue to highlight the primary issue better
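
The behavior described above can be illustrated with a minimal sketch. It is not the Dashboards implementation: it assumes documents are plain dictionaries, and `fields_in_sample` and `SAMPLE_SIZE` are hypothetical names chosen for illustration. It shows why a field that only occurs after the first 500 hits is absent from the field list even though filtering on it still works.

```python
SAMPLE_SIZE = 500  # assumed to mirror the 500-document sample described above


def fields_in_sample(hits, sample_size=SAMPLE_SIZE):
    """Return the set of field names seen in the first `sample_size` hits."""
    seen = set()
    for doc in hits[:sample_size]:
        seen.update(doc.keys())
    return seen


# Hypothetical data: 600 nginx documents followed by 10 php documents.
hits = [{"nginx_status": 200} for _ in range(600)]
hits += [{"php_error": "x", "log_level": "warn"} for _ in range(10)]

sampled = fields_in_sample(hits)
print("log_level" in sampled)  # False

# A filter is evaluated against every document, so it still matches.
matches = [doc for doc in hits if doc.get("log_level") == "warn"]
print(len(matches))  # 10
```

With a sample large enough to reach the php documents (for example `fields_in_sample(hits, 610)`), `log_level` would appear in the result.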

tibz7 commented 1 year ago

aaah ok, it makes sense, thank you! I will open a new issue for the selector.

ashwin-pc commented 1 year ago

cc: @KrooshalUX @kgcreative @dagneyb.

AMoo-Miki commented 1 year ago

I wouldn't call this a bug as it is by design. It certainly is poor UX but I will let @dagneyb decide the fate.

The dialog already indicates that a subset of the data was analyzed, and maybe the wording can change to ... / top 500 results when more than 500 results are shown in Discover.

dagneyb commented 1 year ago

@AMoo-Miki @ashwin-pc @tibz7 I do agree that we should at a minimum call out the limitation, as "top 5 values" is not completely accurate. I'm thinking we could add a subtext under "Top 5 Values" to say "within the first 500 documents". For available fields, why was the decision made to limit to the top 500 documents? Again, it's not accurate to simply state "available fields", as I think that implies the list is comprehensive, so we may need to add subtext there as well, or fix it to expand beyond 500, but I'd like to fully understand why this limitation exists first.
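
To make the proposed subtext concrete, here is a rough sketch of a "top 5 values" summary computed only over the sampled documents, with a label that states the sample explicitly. The function name `top_values` and the in-memory list of documents are assumptions for illustration, not the actual Discover code.

```python
from collections import Counter


def top_values(hits, field, sample_size=500, n=5):
    """Count the most common values of `field` within the sampled hits only."""
    sample = hits[:sample_size]
    counts = Counter(doc[field] for doc in sample if field in doc)
    label = f"Top {n} values within the first {len(sample)} documents"
    return label, counts.most_common(n)


# Hypothetical data: every "error" document falls outside the sample.
hits = [{"log_level": "warn"}] * 500 + [{"log_level": "error"}] * 100

label, values = top_values(hits, "log_level")
print(label)   # Top 5 values within the first 500 documents
print(values)  # [('warn', 500)]
```

The summary reports only `warn` because the `error` documents are never inspected, which is why a label that names the sample is less misleading than a bare "Top 5 Values".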

kgcreative commented 1 year ago

@dagneyb I would be on board with saying "Exists in 500 / 500 sample records" -- just that small terminology change will help clarify that this is not all records, but just a sample. I believe this setting is configurable via the discover:sampleSize setting in advanced settings. @AMoo-Miki or @ashwin-pc, can you confirm if that's the case?

edit: Is there appetite for expanding how samples are done/configured? I can imagine being able to configure different sampling strategies, for example: most recent 500, random 500, or some other sampling algorithm.
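
As a sketch of what configurable sampling strategies might look like, the two strategies named above can be written as interchangeable functions. Everything here is hypothetical: the function names, the `STRATEGIES` registry, and the `timestamp` field are assumptions, and the sampling is done on an in-memory list rather than in the query, which a real implementation would more likely do.

```python
import random


def most_recent(hits, size, time_field="timestamp"):
    """Keep the `size` hits with the largest value of `time_field`."""
    return sorted(hits, key=lambda d: d[time_field], reverse=True)[:size]


def random_sample(hits, size, seed=None):
    """Keep up to `size` hits chosen uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(hits, min(size, len(hits)))


# Hypothetical registry so a setting could select a strategy by name.
STRATEGIES = {"most_recent": most_recent, "random": random_sample}

hits = [{"timestamp": i} for i in range(1000)]
recent = STRATEGIES["most_recent"](hits, 500)
shuffled = STRATEGIES["random"](hits, 500)
print(len(recent), recent[0]["timestamp"])  # 500 999
print(len(shuffled))                        # 500
```

A random sample is more likely to surface fields that occur in only a minority of documents, at the cost of no longer showing the newest records first.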

tibz7 commented 1 year ago

@dagneyb aah, thank you for the explanation! For the top 5, I think it makes sense to specify within 5000. For the available fields, I think it would be better to have them all available, even if that includes some lazy loading. If not possible, a clearer message would do.

kgcreative commented 1 year ago

@tibz7 - in the "Filter by type" dropdown, you can uncheck "hide missing fields" -- and that will show all fields from the index pattern instead of just the fields present in the currently selected sample.

ashwin-pc commented 1 year ago

@kgcreative While I like this idea, can we take it a step further and reduce the opacity of the fields that are missing when "hide missing fields" is turned off?

kgcreative commented 1 year ago

@ashwin-pc I like this approach, yes. "Hide missing fields" is existing functionality, so having a better indicator when those fields are missing in the sample would be good. Maybe with a call-out to expand the date range or change the sampling method?
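
The combination discussed above, listing every index-pattern field while marking the ones absent from the sample, can be sketched as follows. This is an illustration only: `field_rows` and its `missing` flag are hypothetical names, and the actual rendering (for example dimming a row) is left to the UI.

```python
def field_rows(pattern_fields, sampled_fields, hide_missing=True):
    """Build rows for the field list, flagging fields absent from the sample."""
    rows = []
    for name in sorted(pattern_fields):
        missing = name not in sampled_fields
        if missing and hide_missing:
            continue  # default behavior: drop fields not seen in the sample
        rows.append({"name": name, "missing": missing})
    return rows


pattern_fields = {"log_level", "nginx_status", "php_error"}
sampled_fields = {"nginx_status"}

print(field_rows(pattern_fields, sampled_fields))
# [{'name': 'nginx_status', 'missing': False}]

all_rows = field_rows(pattern_fields, sampled_fields, hide_missing=False)
print([r["name"] for r in all_rows if r["missing"]])
# ['log_level', 'php_error']
```

Rows with `missing` set to true are the ones a UI could dim or pair with a hint to expand the date range.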

ashwin-pc commented 4 months ago

Updating the issue as an enhancement, since the main issue called out here already exists and it is how it's exposed to the user that needs to improve.