datacleaner / DataCleaner

The premier open source Data Quality solution
GNU Lesser General Public License v3.0

Visually distinguish between source columns and transformed/output columns #482

Open kaspersorensen opened 9 years ago

kaspersorensen commented 9 years ago

We find that customers/users sometimes get confused by the column selections. This happens particularly often when there are multiple transformers in the job. It would make life a lot easier if the columns had some symbol or other styling attached to them that revealed where they come from, in particular whether a column is a source column or an output column of a transformer.

kaspersorensen commented 9 years ago

By the way: this reminds me a lot of issue #15 ... Maybe that would be a superior solution?

drexler42 commented 9 years ago

I like the idea of grouping the input columns, as in #15. That would structure the columns into smaller chunks in the mind of the user. (But with very complicated jobs it could be more difficult still.)

Somehow, I feel the section on output columns in DataCleaner should be more prominent. Today, it lives right at the bottom, even below the optional/advanced properties. But, when writing jobs, pruning the output columns that are not needed is important to keep things under control.

The output columns are now named after the component they were created by (e.g. "familyname (merged)"). With the input columns grouped as in #15, maybe the parentheses can go? The column would already be listed under the Merge component's sublist. That would shorten the column names, making them more readable.
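To make the naming idea concrete, here is a minimal sketch of how a label could drop its trailing parenthesised suffix once the column is shown under its component's group. The class `ColumnLabels`, the method `shortLabel`, and the `groupedUnderComponent` flag are all hypothetical names for illustration; none of this is DataCleaner's actual API.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of DataCleaner.
final class ColumnLabels {

    // Matches one trailing parenthesised suffix such as " (merged)".
    private static final Pattern SUFFIX = Pattern.compile("\\s*\\([^()]*\\)\\s*$");

    /**
     * Returns the label to display for a column. The suffix is only dropped
     * when the column is already listed under its component's group
     * (assumption: the caller knows this); otherwise the full name is kept.
     */
    static String shortLabel(String columnName, boolean groupedUnderComponent) {
        if (!groupedUnderComponent) {
            return columnName;
        }
        String stripped = SUFFIX.matcher(columnName).replaceFirst("");
        // Never return an empty label, e.g. for a name that is only "(merged)".
        return stripped.isEmpty() ? columnName : stripped;
    }

    public static void main(String[] args) {
        System.out.println(shortLabel("familyname (merged)", true));
        System.out.println(shortLabel("familyname (merged)", false));
    }
}
```

Outside a grouped list (for example in a tabular view) the caller would pass `false` and keep the full name.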

kaspersorensen commented 9 years ago

I think we should take one step at a time in that regard. Simplifying the output column names is something we need to consider on a case-by-case basis, because sometimes it could be really confusing not to have a new name. The columns also appear in other contexts, such as tabular views. It could also be that the same field is manipulated by multiple transformers, in which case it would be difficult to tell the different stages of manipulation apart.

kaspersorensen commented 9 years ago

Another question on my mind wrt. #15 is how to handle situations where the column order is changed... If the order is changed so that the sequence is a mix of source columns, then transformed columns, then more source columns, the grouping labels would probably have to disappear or something like that ... Difficult.
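The ordering problem can be sketched as follows: walk the columns in display order, start a new group label whenever the origin changes, and check whether any origin shows up in more than one run. `ColumnRuns`, `Column`, `runLabels` and `groupingIsClean` are hypothetical names with a deliberately simplified column model, not DataCleaner classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// Hypothetical sketch, not part of DataCleaner.
final class ColumnRuns {

    /** Simplified column model: a name plus a label for where it comes from. */
    static final class Column {
        final String name;
        final String origin; // e.g. "Source" or a transformer name (assumption)

        Column(String name, String origin) {
            this.name = name;
            this.origin = origin;
        }
    }

    /** Returns one label per run of consecutive columns that share an origin. */
    static List<String> runLabels(List<Column> columns) {
        List<String> labels = new ArrayList<>();
        String previous = null;
        for (Column column : columns) {
            if (!column.origin.equals(previous)) {
                labels.add(column.origin);
                previous = column.origin;
            }
        }
        return labels;
    }

    /** True if every origin forms exactly one contiguous run. */
    static boolean groupingIsClean(List<Column> columns) {
        List<String> labels = runLabels(columns);
        return labels.size() == new HashSet<>(labels).size();
    }

    public static void main(String[] args) {
        List<Column> mixed = Arrays.asList(
                new Column("firstname", "Source"),
                new Column("familyname (merged)", "Merge"),
                new Column("email", "Source"));
        System.out.println(runLabels(mixed));
        System.out.println(groupingIsClean(mixed));
    }
}
```

When `groupingIsClean` returns `false`, as in the mixed order above, a UI could either repeat the group label for each run or fall back to a per-column marker instead of grouping labels.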