TSSlade / google-refine

Automatically exported from code.google.com/p/google-refine

[Wishlist] Seamless conversion of arrays into multiple columns #36

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Unless I'm very much mistaken, there's no direct way to turn an array value into multiple columns. Using join() requires finding a suitable separator character to split on later, which in turn requires knowledge of the full content of the column.
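
To make the separator problem concrete, here is a minimal sketch in Python (not GREL; the sample values are made up for illustration). It shows how a join-then-split round trip can return more values than it started with when the separator already occurs in the data.

```python
# Hypothetical cell value: an array whose second element contains a comma.
parts = ["Smith", "Jones, Jr.", "Lee"]

# Round trip through a single string, using "," as the separator.
joined = ",".join(parts)
round_trip = joined.split(",")

print(len(parts))       # 3
print(len(round_trip))  # 4, because "Jones, Jr." was split in two
```

Picking a safe separator therefore means scanning every value in the column first, which is the extra knowledge the report refers to.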

Perhaps returning an array from a transform should create new columns representing the array values? I'm not fussy about the specifics, but at the moment arrays seem to be second-class data types.

Original issue reported on code.google.com by AndrewOf...@gmail.com on 14 May 2010 at 2:53

GoogleCodeExporter commented 8 years ago
Yeah, I agree, it would be useful to have your expression return an array and have gw treat that as a way to build multiple columns. The problem, though, is that an expression could yield arrays of different lengths on each row, which means that gw would have to do this in two passes: the first to work out how many columns have to be created in total (taking the max length of all the arrays returned by applying the expression to each row) and the second to create and fill the cells in the new columns.

Another issue is naming the columns, but we could just generate names (say columnXX with an incremental counter).
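
The two passes and the counter-based naming could look roughly like the following. This is a Python sketch under stated assumptions, not how gw implements anything: rows are modelled as plain dicts, `expand_to_columns` and its `prefix` parameter are hypothetical names, and collisions with existing column names are not handled.

```python
from typing import Callable


def expand_to_columns(
    rows: list[dict],
    expression: Callable[[dict], list],
    prefix: str = "column",
) -> list[str]:
    """Add one new column per array position; return the new column names."""
    # Pass 1: evaluate the expression on every row and find the longest array.
    results = [expression(row) for row in rows]
    width = max((len(r) for r in results), default=0)

    # Name the columns with an incremental counter: column1, column2, ...
    names = [f"{prefix}{i + 1}" for i in range(width)]

    # Pass 2: fill the cells, leaving None where a row's array is shorter.
    for row, result in zip(rows, results):
        for i, name in enumerate(names):
            row[name] = result[i] if i < len(result) else None
    return names


rows = [{"tags": "a,b,c"}, {"tags": "d"}, {"tags": "e,f"}]
names = expand_to_columns(rows, lambda row: row["tags"].split(","))
print(names)    # ['column1', 'column2', 'column3']
print(rows[1])  # {'tags': 'd', 'column1': 'd', 'column2': None, 'column3': None}
```

The sketch keeps the results of the first pass in memory so the expression is evaluated only once per row; re-evaluating it in the second pass would trade that memory for extra computation.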

Another option is to have some sort of 'column creator manifest', something like

{{{
   value.split(',').make_columns({ "something" : result[0], "whatever" : result[1] });
}}}

but that gets very verbose pretty fast.

Thoughts?

Original comment by stefano.mazzocchi@gmail.com on 14 May 2010 at 4:13

GoogleCodeExporter commented 8 years ago
Actually, the existing column splitting command already deals with both issues. We only need to make it take any arbitrary expression that produces arrays.

Original comment by dfhu...@gmail.com on 14 May 2010 at 4:37

GoogleCodeExporter commented 8 years ago

Original comment by dfhu...@gmail.com on 27 May 2010 at 2:11

GoogleCodeExporter commented 8 years ago

Original comment by dfhu...@gmail.com on 27 May 2010 at 2:17

GoogleCodeExporter commented 8 years ago
Does this issue also cover a simple UI for Edit Column / Join? For example, I have two or more columns (first name, last name) that I want to combine easily in order to reconcile against /person. I would simply type the column names with a "," separator, and on apply the joined values would go into a new column that I name. We have Edit Cell / Join but no Edit Column / Join.

Original comment by thadguidry on 29 May 2010 at 8:18

GoogleCodeExporter commented 8 years ago

Original comment by iainsproat on 23 Jun 2010 at 6:06

GoogleCodeExporter commented 8 years ago

Original comment by dfhu...@gmail.com on 18 Jul 2010 at 1:03

GoogleCodeExporter commented 8 years ago

Original comment by dfhu...@google.com on 27 Sep 2010 at 10:02

GoogleCodeExporter commented 8 years ago

Original comment by dfhu...@google.com on 27 Sep 2010 at 10:10

GoogleCodeExporter commented 8 years ago
In case anyone is looking for a simple workaround to join two columns (for example, joining a firstname and lastname column into a single 'name' column): I found the simplest solution was to export the data from Refine as an Excel spreadsheet, and then to use the 'concatenate' function in Excel to join them.

The concatenate formula (including a space between the firstname and lastname) is:

 =A2&" "&B2

I found that for this to work neatly, you should first use Refine to trim the whitespace from before and after the text strings. Other than that, it worked a charm!
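
For anyone scripting the same workaround outside Excel, the trim-then-concatenate step might be sketched like this in Python. The helper name `join_columns` and the dict-based row are assumptions for illustration only; nothing here is a Refine feature.

```python
def join_columns(row: dict, columns: list[str], separator: str = " ") -> str:
    """Join the named cells after trimming surrounding whitespace."""
    cells = [(row.get(name) or "").strip() for name in columns]
    # Skip empty cells so a missing value does not leave a stray separator.
    return separator.join(cell for cell in cells if cell)


row = {"firstname": "  Ada ", "lastname": "Lovelace  "}
row["name"] = join_columns(row, ["firstname", "lastname"])
print(row["name"])  # Ada Lovelace
```

Unlike the Excel formula, this sketch drops empty cells rather than emitting the separator next to them; that is a design choice, not part of the original workaround.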

Original comment by supp...@nickpoole.org.uk on 31 Jul 2011 at 5:44

GoogleCodeExporter commented 8 years ago
Remove obsolete milestone

Original comment by tfmorris on 18 Sep 2012 at 5:20

GoogleCodeExporter commented 8 years ago

Original comment by tfmorris on 18 Sep 2012 at 5:21