Closed yumengwang03 closed 2 years ago
@Startrekzky @yumengwang03 Thanks for the UI proposal/feedback. I feel there are many months of development history not taken into consideration; also, our interface is the result of the selective business requirements we chose to meet each sprint. As a result, there are features and interactions that still need completion. I've noted some responses below to add further context.
- Reduce user flow friction: Several text input fields (e.g. GitHub endpoint URL) are redundant and should be eliminated for users. Some other text input fields require users to memorize/look up Regular Expressions, Cron code, Board IDs, etc. to fill in; it would be friendlier to turn them into dropdown selectors.
The endpoint URL may be standard for some providers like GitHub, but for JIRA it is custom, depending on how the instance is deployed. For GitHub we can simply prefill a default value with the currently known REST endpoint. I agree there are many visual improvements, as well as several ways to design a GUI for Crontab configuration, that could make visual scheduling easier for less advanced users. However, for advanced users the current interface would still be very practical.
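One way to read the Crontab point is a dropdown of presets that still leaves a free-text path for advanced users. The sketch below is only an illustration under that assumption: the preset labels, `CRON_PRESETS`, and `resolve_schedule` are hypothetical names, not part of config-ui.

```python
# Hypothetical dropdown presets mapped to standard 5-field cron expressions.
CRON_PRESETS = {
    "Hourly": "0 * * * *",
    "Daily (midnight)": "0 0 * * *",
    "Weekly (Sunday midnight)": "0 0 * * 0",
    "Monthly (1st, midnight)": "0 0 1 * *",
}


def resolve_schedule(selection: str, custom_expression: str = "") -> str:
    """Return the cron expression for a dropdown selection.

    Selecting "Custom" keeps the existing free-text input available,
    with only a minimal check that five fields were provided.
    """
    if selection == "Custom":
        fields = custom_expression.split()
        if len(fields) != 5:
            raise ValueError("expected a 5-field cron expression")
        return " ".join(fields)
    return CRON_PRESETS[selection]
```

Less advanced users would pick a preset, while `resolve_schedule("Custom", "*/15 * * * *")` keeps the current behavior for users who prefer typing the expression.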
- Correct the imbalance between regular UI mode and Advanced JSON mode when creating pipelines: The functions of the regular UI mode don't match those of the Advanced mode, in which the user types in JSON for configuration. For instance, the regular UI mode has missing configuration fields (e.g. Feishu).
There are intentional differences between Advanced and Standard (Visual) Mode. For instance, multi-stage was intentionally left out of standard mode due to an engineering decision, although there were plans to add stage support to the main interface. The pipeline provider options were mainly driven by our Data Integrations, and some additional plugins were created as "Pipeline-only" plugins (GitExtractor, Refdiff, etc.). Advanced Mode was created to allow expert features we didn't yet want to incorporate in the visual interface. That being said, Feishu and other plugins can be added to the main interface as needed.
- Show the most relevant details for users in more obvious places (e.g. progression/status of pipeline runs): Lots of status/progression-related information is buried too deep. For instance, in the Blueprint list, even unfolding a Blueprint does not reveal the tasks of that Blueprint.
The Tasks list is one sub-component that needs to be added to the Blueprint details view once expanded; however, due to the limited features we released for the first version of Blueprints, it was not yet added.
- Reconsider the order of configuration tasks for each data provider: Based on different data providers, we should reconsider the order of configuration tasks. For example, for GitHub we should let users select repos first before filling in PR and Issue Type options. We can draw a flow chart for each.
Field/Input order can be customized as needed for each Data Provider. Mockups should be made for the preferred presentation of the Provider parameters.
- Disentangle convoluted concepts: The concepts of Pipelines and Blueprints are presented in a convoluted way by the UI. We should eliminate the "All Pipeline Runs" page and unify the functions of "creating a Pipeline with Blueprint" and "creating a Blueprint with a previously defined Pipeline task" into one place.
I don't agree that they are convoluted. Blueprints reflect a recurring plan/configuration, whereas Pipelines represent the historical runs/executions of that Blueprint. Users should still be able to execute a pipeline without the overhead of creating a recurring data collection plan. Fully merging the 2 concepts, as proposed, would force a user to create a Blueprint before being able to run a pipeline, which would make on-demand pipelines impossible. We are already showing related pipelines with the Blueprints; once tasks are displayed with the Blueprint and the interface changes are made, the current approach would make sense.
offer available sub tasks for user to select what to run
@klesh Yes this can be done; it has been requested since ticket 924 https://github.com/merico-dev/lake/issues/924, but was never prioritized during sprint planning. The Backend would also need to provide a new set of API endpoints, `[GET] /plugins/task-options/github` for example, that can provide a list of available tasks for each plugin/provider, and also which tasks, if any, should be enabled by default.
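To make the request concrete, here is a rough sketch of what such a task-options response could carry and how the UI might derive its default selection. Everything here is an assumption for illustration: the payload shape, the `TaskOption` type, and the subtask names are placeholders, since no such endpoint exists yet.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class TaskOption:
    name: str                 # subtask identifier (placeholder names below)
    enabled_by_default: bool  # whether the UI should pre-select it


# Hypothetical registry standing in for what each plugin would report.
TASK_OPTIONS = {
    "github": [
        TaskOption("collectIssues", True),
        TaskOption("collectPullRequests", True),
        TaskOption("collectEvents", False),
    ],
}


def get_task_options(plugin: str) -> str:
    """Build the JSON body a task-options endpoint might return."""
    options = TASK_OPTIONS.get(plugin)
    if options is None:
        raise KeyError(f"unknown plugin: {plugin}")
    return json.dumps({"plugin": plugin, "tasks": [asdict(o) for o in options]})


def default_selection(plugin: str) -> List[str]:
    """Names of the subtasks the UI would tick by default."""
    payload = json.loads(get_task_options(plugin))
    return [t["name"] for t in payload["tasks"] if t["enabled_by_default"]]
```

With a payload like this, the UI could render one checkbox per task and pre-select only those flagged as enabled by default.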
can we show button for load config in advanced mode instead of clicking date
@warren830 We do have 2 access points already for loading configurations. The Pipeline Name Menu and the Settings Gear on the Tasks Editor Panel.
to create issues for this epic
@yumengwang03 @e2corporation @klesh I got some feedback from end users and from our own experience of setting up demo instances. This feedback is not just for config-ui, but also for the configuration feature itself.
GitHub/GitLab users have to follow a long doc to collect full data from GitHub/GitLab. They have to create a pipeline with both the GitHub and GitExtractor plugins, which is troublesome.
HIGH
GitHub users have to configure complicated RegEx to convert labels. This has two problems: a) General users who only consume the GitHub Basic Dashboard don't have to configure it, so they shouldn't have to see this configuration on the connection page. b) For advanced users who consume the 'release-based dashboard' or other dashboards that rely on label conversion and pr-issue mapping, it's hard to configure the right RegEx. Unlike RegEx learning sites or Grafana variables, which show you the results immediately, config-UI cannot tell whether the user's RegEx is correct or how the labels will be converted. Users are very likely to enter the wrong RegEx, then have to figure out the right one and re-convert the data.
HIGH
Users may collect useless data, which will affect data collection speed and the metrics in pre-built dashboards.
MEDIUM
Configuration rules for the GitHub plugin apply to all repos collected; GitHub maintainers cannot configure each specific repo.
LOW. The reason is, compared to the previous problems, I think it's not very common. Many maintainers who have more than 1 repo, such as PingCAP, use the same label format across different repos. In this case, one global setting applying to all repos works.
@hezyin @Startrekzky @klesh and I had a discussion on the next-step improvements for Config UI, summarized in the following table. The solution and priority columns are open for discussion.
Issue | Description | Solution | Priority |
---|---|---|---|
1. When creating a Blueprint, design a more user-friendly orchestration of data source configuration | Users care about what data are collected, rather than how data are collected. The purposes of GitExtractor and RefDiff are not familiar to users. | We can hide GitExtractor from the UI, but if a user has selected GitHub or GitLab as data sources, we automatically enable GitExtractor to collect data. I'm not sure about RefDiff; it seems like an additional function that can be enabled under GitHub/GitLab. | High (because the current orchestration causes the most user confusion) |
2. When creating a Blueprint, allow users to select data scope and data entities | We want to adapt the task flow more closely to users' intention by allowing them to select which data entities they'd like to select for a particular data source, at the same time introducing our domain layer concept to them without causing too much learning cost. | Two solutions for discussion: | Medium |
3. Move transformation rules (the RegEx section) from Data Integration to Creating Blueprints and replace RegEx with a UI that gives feedback to users immediately | Three reasons: | Move the transformation rules into Creating Blueprints, putting it after selecting the data scope. | High (because this is a major hurdle for advanced users) |
4. Redesign a more reasonable workflow of creating a Blueprint | We want users to create a Blueprint, and then all runs will be automatically generated as records under that Blueprint. Thus, a Blueprint should: 1. contain the data scope, data entities and transformation rules of all applied data sources; 2. have a running frequency (manual or recurring). | | High (this affects the entire perception of what a Blueprint is and all tasks associated with it) |
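Item 3 in the table asks for immediate feedback on label RegEx. A minimal sketch of that idea, assuming the UI has a handful of sample labels available and that a pattern maps matching labels to a single target type, could look like this; `preview_label_mapping` and its return shape are hypothetical and not part of config-ui.

```python
import re
from typing import Dict, List


def preview_label_mapping(pattern: str, labels: List[str], target_type: str) -> Dict:
    """Show which sample labels a pattern would convert, before saving it.

    Returns the compile error (if any), the labels that would be mapped
    to target_type, and the labels the pattern leaves untouched.
    """
    try:
        compiled = re.compile(pattern)
    except re.error as exc:
        # Report an invalid pattern instead of silently converting nothing.
        return {"error": f"invalid pattern: {exc}", "matched": {}, "unmatched": list(labels)}

    matched, unmatched = {}, []
    for label in labels:
        if compiled.search(label):
            matched[label] = target_type
        else:
            unmatched.append(label)
    return {"error": None, "matched": matched, "unmatched": unmatched}
```

For example, previewing the pattern `^(bug|defect)$` against the sample labels `bug`, `feature` and `defect` would list `bug` and `defect` as matched and `feature` as unmatched, so a user can spot a wrong pattern before any data is re-converted.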
@yumengwang03
For issue #1, the `refdiff` plugin provides some difference calculation algos between two `ref`s (tag/branch), like: given two `ref`s, 1 older, another newer, calculate how many commits had been committed since `older` to `newer`, so the user can see how many commits there are between 2 releases.
Each `refdiff` subtask takes 2 inputs, `old_ref` and `new_ref`; it then calculates the differences and stores the results into `refs_issues_diffs`, `ref_commits_diffs` and `refs_pr_cherrypicks` accordingly.
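As an illustration of the commit-difference idea only (not the plugin's actual implementation), the sketch below counts commits reachable from the newer ref but not from the older one on a toy in-memory history. The `commit_parents` and `refs` structures and both function names are assumptions made for this example.

```python
from typing import Dict, List, Set


def ancestors(commit_parents: Dict[str, List[str]], start: str) -> Set[str]:
    """All commits reachable from start, including start itself."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        sha = stack.pop()
        if sha in seen:
            continue
        seen.add(sha)
        stack.extend(commit_parents.get(sha, []))
    return seen


def commits_between(
    commit_parents: Dict[str, List[str]],
    refs: Dict[str, str],
    old_ref: str,
    new_ref: str,
) -> Set[str]:
    """Commits that are in new_ref's history but not in old_ref's."""
    old = ancestors(commit_parents, refs[old_ref])
    new = ancestors(commit_parents, refs[new_ref])
    return new - old


# Toy linear history a <- b <- c <- d, with two release tags.
history = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"]}
tags = {"v1.0": "b", "v1.1": "d"}
print(len(commits_between(history, tags, "v1.0", "v1.1")))
```

Here the two releases differ by commits `c` and `d`, which is the kind of per-release count a user would want to see.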
@yumengwang03 @Startrekzky
For issue #2, I would like there to be a step somewhere in the process where we can pick the specific `subtasks` that we wish to run.
will be addressed by #1862
Description
Yumeng thinks the current Config UI has a series of problems/potential improvements, so she'd like to share some thoughts here for discussion:
Objectives of improvements
How do we improve?
1. Reduce user flow friction: Several text input fields (e.g. GitHub endpoint URL) are redundant and should be eliminated for users. Some other text input fields require users to memorize/look up Regular Expressions, Cron code, Board IDs, etc. to fill in; it would be friendlier to turn them into dropdown selectors.
2. Correct the imbalance between regular UI mode and Advanced JSON mode when creating pipelines: The functions of the regular UI mode don't match those of the Advanced mode, in which the user types in JSON for configuration. For instance, the regular UI mode has missing configuration fields (e.g. Feishu).
3. Show the most relevant details for users in more obvious places (e.g. progression/status of pipeline runs): Lots of status/progression-related information is buried too deep. For instance, in the Blueprint list, even unfolding a Blueprint does not reveal the tasks of that Blueprint.
4. Reconsider the order of configuration tasks for each data provider: Based on different data providers, we should reconsider the order of configuration tasks. For example, for GitHub we should let users select repos first before filling in PR and Issue Type options. We can draw a flow chart for each.
5. Disentangle convoluted concepts: The concepts of Pipelines and Blueprints are presented in a convoluted way by the UI. We should eliminate the "All Pipeline Runs" page and unify the functions of "creating a Pipeline with Blueprint" and "creating a Blueprint with a previously defined Pipeline task" into one place.