Is your feature request related to a problem? Please describe
At work, I am building dashboards that show the work all of our servers are processing at any given time. I am severely limited by how many data points I can add to a given dashboard. I have to either filter down the number of servers, limit the time window to days or even hours, or aggregate the data so heavily that we lose all context. I have a prototype in Vega that does what I want, but our dashboard cannot handle that much data, even though Vega itself has no problem with it and the dashboard supports Vega.
Describe the solution you'd like
I want to be able to scale up my visualizations. If OpenSearch offered the ability to return data in the Apache Arrow format, we could handle a lot more data on the frontend via Vega. Other visualization technologies on the frontend could also potentially take advantage of Apache Arrow. Here is a discussion of using Vega with Apache Arrow: https://observablehq.com/@theneuralbit/introduction-to-apache-arrow
While we are at it, it shouldn't be difficult to also add the ability to ingest Apache Arrow data. I am happy to work on the implementation, especially if someone can point me to where the code would go architecturally and which interfaces it would need to implement.
Related component
Search:Performance
Describe alternatives you've considered
I have built my own custom visualizations, but it would be nice if OpenSearch could handle this out of the box rather than me needing to go to another tool.
Additional context
Although it uses different technology, here is a prototype of the idea that I built several years ago: https://d2xis0feu0l7hz.cloudfront.net/index.html
It uses D3 on the frontend and my own columnar format as the data format. As a POC, I was able to display a table of 10 million rows of finance data. I also have a chart in which D3 aggregates all 43 million rows of data. For comparison, Excel has a limit of about 1 million rows per sheet, and Google Sheets has a limit of 10 million cells. Apache Arrow should be a good replacement for my columnar format and would make the solution more standard.
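To illustrate why a columnar layout helps on the frontend, here is a minimal sketch in TypeScript. It is not Apache Arrow's API and not the format used in the prototype; the `ColumnarTable` type, the column names, and the synthetic data are all made up for the example. It only shows the general idea: each column lives in one contiguous typed array, so an aggregation is a tight loop over numbers rather than a walk over millions of row objects.

```typescript
// Hypothetical columnar table: one typed array per column.
// This is an illustration only, not Apache Arrow's actual API.
interface ColumnarTable {
  serverId: Uint16Array;   // which server produced each data point
  latencyMs: Float64Array; // measured value for each data point
}

// Build a synthetic table of `rows` data points spread over `servers` servers.
function makeTable(rows: number, servers: number): ColumnarTable {
  const serverId = new Uint16Array(rows);
  const latencyMs = new Float64Array(rows);
  for (let i = 0; i < rows; i++) {
    serverId[i] = i % servers;
    latencyMs[i] = (i % 100) + 1; // deterministic stand-in for real measurements
  }
  return { serverId, latencyMs };
}

// Mean latency per server, computed in a single pass over the two columns.
function meanByServer(table: ColumnarTable, servers: number): Float64Array {
  const sums = new Float64Array(servers);
  const counts = new Uint32Array(servers);
  for (let i = 0; i < table.serverId.length; i++) {
    const s = table.serverId[i];
    sums[s] += table.latencyMs[i];
    counts[s] += 1;
  }
  return sums.map((sum, s) => (counts[s] > 0 ? sum / counts[s] : 0));
}

const table = makeTable(1000000, 4);
const means = meanByServer(table, 4);
console.log(Array.from(means)); // [49, 50, 51, 52]
```

The one million data points here occupy two flat buffers (about 10 MB in total) instead of one million JavaScript objects. A standard format such as Apache Arrow would, in principle, let the server send buffers like these directly instead of JSON rows that the browser has to parse.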