Businesses and governments hold large amounts of data and want to learn about the structures and patterns in it. This can include making predictions that extend from the data.
Companies, research organizations, and governments often collect observations that carry timestamps. It is often useful to find trends or patterns in such data over time, including forecasting future trends. These types of analyses fall under the umbrella of time series analysis.
Specific examples include:
enterprise resource planning and management
environmental monitoring
https://blog.newrelic.com/2016/11/16/dynamic-baseline-alerts/
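As a minimal sketch of the trend and forecast ideas described above, the following uses only the Python standard library. The function names `moving_average` and `linear_forecast` and the sample `readings` are illustrative assumptions, not part of any existing tool; a straight-line fit is only one of many possible forecasting approaches.

```python
from statistics import mean


def moving_average(values, window):
    """Smooth a series: one average per full trailing window."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [mean(values[i - window:i]) for i in range(window, len(values) + 1)]


def linear_forecast(values, steps):
    """Fit y = intercept + slope * t by least squares, then extrapolate."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = mean(values)
    covariance = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    variance = sum((t - t_mean) ** 2 for t in range(n))
    slope = covariance / variance
    intercept = y_mean - slope * t_mean
    # Predict the next `steps` time indices after the observed ones.
    return [intercept + slope * t for t in range(n, n + steps)]


# Hypothetical evenly spaced observations (for example, one per hour).
readings = [10.0, 12.0, 14.0, 16.0, 18.0]
smoothed = moving_average(readings, window=3)
forecast = linear_forecast(readings, steps=2)
```

The sketch assumes evenly spaced timestamps; real data with gaps or seasonality would need resampling or a richer model.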
We can search medium or large data sets for values that fall outside the normal range or variance. These 'abnormal' data points may indicate problems or unique conditions that need attention. Anomaly detection algorithms can help decision makers quickly find unusual segments of data.
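One simple way to illustrate "values outside the normal range" is a z-score test: flag any point that lies more than a chosen number of standard deviations from the mean. This is a sketch of that one technique, not a recommendation; the function name `find_anomalies`, the threshold of 2.0, and the sample data are assumptions made for illustration.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs whose z-score magnitude exceeds threshold."""
    center = mean(values)
    spread = pstdev(values)  # population standard deviation
    if spread == 0:
        return []  # all values identical, so nothing stands out
    return [
        (i, v)
        for i, v in enumerate(values)
        if abs(v - center) / spread > threshold
    ]


# Hypothetical measurements with one obvious spike at the end.
samples = [10, 11, 9, 10, 12, 10, 11, 50]
anomalies = find_anomalies(samples)
```

Because a large outlier also inflates the mean and standard deviation, a robust variant based on the median may behave better on small samples.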
Specific examples include:
There are myriad tools to help people design Machine Learning workflows. However, there does not appear to be a visual programming environment with machine learning primitives.
Build a general-purpose machine learning programming environment that is accessible through a REST API and a web user interface.
The design will most likely consist of a REST API and User Interface, developed as separate components.
A REST API would make it easy to use Machine Learning algorithms, since users would not have to install or maintain the ML software.
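To make the resource-oriented idea concrete, here is a minimal sketch of request routing for such an API, written as a plain function so it runs without a web framework or a network socket. Everything in it is hypothetical: the `/models` and `/models/<id>/predictions` routes, the `handle` function, the in-memory `MODELS` store, and the trivial "predict the mean" learner that stands in for a real algorithm.

```python
import json
from statistics import mean

MODELS = {}  # in-memory store: model id -> fitted parameters


def handle(method, path, body=None):
    """Route one request and return (status_code, json_text)."""
    parts = [p for p in path.split("/") if p]
    payload = json.loads(body) if body else {}

    # POST /models : train a model from the submitted targets.
    if method == "POST" and parts == ["models"]:
        model_id = str(len(MODELS) + 1)
        MODELS[model_id] = {"mean": mean(payload["targets"])}
        return 201, json.dumps({"id": model_id})

    # GET /models/<id> : inspect a trained model.
    if method == "GET" and len(parts) == 2 and parts[0] == "models":
        model = MODELS.get(parts[1])
        if model is None:
            return 404, json.dumps({"error": "unknown model"})
        return 200, json.dumps(model)

    # POST /models/<id>/predictions : one prediction per submitted row.
    if (method == "POST" and len(parts) == 3
            and parts[0] == "models" and parts[2] == "predictions"):
        model = MODELS.get(parts[1])
        if model is None:
            return 404, json.dumps({"error": "unknown model"})
        rows = payload.get("rows", [])
        return 200, json.dumps({"predictions": [model["mean"]] * len(rows)})

    return 404, json.dumps({"error": "no such route"})
```

A client would first create a model, then request predictions against the returned id, without installing any ML software locally.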
The API might be structured to mirror the Orange3 User Interface. Specifically, the Orange3 UI has the following structure:
The user interface for machine learning algorithms will make it easy for people with little programming experience to build machine learning services. The UI should include interfaces for interacting with data, sequencing ML tasks, and accessing output. It might also include basic visualizations, such as histograms, to give users insight into the data.
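Sequencing ML tasks can be pictured as a small data-flow graph in which each task consumes the outputs of the tasks upstream of it. The sketch below is one possible representation, assuming Python 3.9 or later for `graphlib`; the `run_flow` function and the `load`, `clean`, and `summarize` task names are hypothetical and do not reflect how Orange3 or any other tool stores workflows.

```python
from graphlib import TopologicalSorter


def run_flow(nodes, edges):
    """Run every task once, after all of its upstream tasks.

    nodes: task name -> function taking one argument per upstream task
    edges: task name -> list of upstream task names
    """
    results = {}
    for name in TopologicalSorter(edges).static_order():
        inputs = [results[upstream] for upstream in edges.get(name, [])]
        results[name] = nodes[name](*inputs)
    return results


# Hypothetical three-step flow: load data, drop missing values, summarize.
nodes = {
    "load": lambda: [3.0, 1.0, None, 2.0],
    "clean": lambda rows: [r for r in rows if r is not None],
    "summarize": lambda rows: {"count": len(rows), "max": max(rows)},
}
edges = {"load": [], "clean": ["load"], "summarize": ["clean"]}

results = run_flow(nodes, edges)
```

A visual editor could produce the same `nodes` and `edges` structures from boxes and connecting lines, leaving execution order to the topological sort.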
UI contains features such as:
It is worth building on top of existing tools, to make our work more focused. This section outlines relevant tools for building the idea as easily as possible.
Orange3: machine learning user interface with drag and drop modelling, visualization, data management and more.
While Orange3 has a user interface, it is based on the Qt framework. This design decision means Orange3 is primarily limited to desktop use. It may be desirable to build a web-native user interface, so that end users need nothing beyond a web browser to use the software.
Protocols and Structures for Inference Machine Learning as a Service - an architecture for presenting machine learning algorithms, their inputs (data) and outputs (predictors) as resource-oriented RESTful web services in order to make machine learning technology accessible to a broader range of people than just machine learning researchers.
To build out the overall user interface, we can select an existing JS UI framework, such as:
Following the conventions in the Orange3 user interface, ML sequences can be modeled as data flows. To facilitate this type of modelling and interaction, we can build on an existing JavaScript UI framework such as the following:
There are some programming environments that support a flow-based visual workflow. The following examples are open-source and run in a web browser:
For the same reasons as the user interface, the visualization framework should be based on web standards.
A discussion was opened in the Orange3 repository related to open-source, web-based data visualization frameworks.
Proposals for the data visualization framework include:
Altair - Altair is a declarative statistical visualization library for Python, based on the powerful Vega-Lite visualization grammar.
Bokeh - Bokeh is a Python interactive visualization library that targets modern web browsers for presentation.
Matplotlib D3 (mpld3) - The mpld3 project brings together Matplotlib, the popular Python-based graphing library, and D3js, the popular JavaScript library for creating interactive data visualizations for the web.