ddavisqa / google-refine

Automatically exported from code.google.com/p/google-refine

Ability to export transformations #385

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Very cool project, like a spreadsheet for the 21st century.

It would be great to be able to export a list of the exact transformations 
executed on a data set. 

It would be even greater to export Java/Ruby/Python/Javascript code that would 
perform the same transformations.

Original issue reported on code.google.com by justin.w...@gmail.com on 20 May 2011 at 3:09

GoogleCodeExporter commented 8 years ago
I meant to flag this as an enhancement, sorry for the confusion

Original comment by justin.w...@gmail.com on 20 May 2011 at 3:10

GoogleCodeExporter commented 8 years ago
Concerning your first question, exporting a list of the transformations...
Look at Video 2 at http://code.google.com/p/google-refine/ at around the 7:40 
mark, which shows how.

Original comment by thadguidry on 20 May 2011 at 3:46

GoogleCodeExporter commented 8 years ago
Nice, I had missed that part of the video.

I think it would be handy if in addition to exporting a JSON snapshot of 
transformations, you could export code that would execute those operations.  
I'm imagining the ability to dump out a shell script, so that you could

cat similar_data_set | refine_generated_parser.rb --use refine_generated_transforms.json > reformated_textfile_csv_etc.txt

If the underlying data format changes, you could throw the JSON back into 
Refine as shown in the video to update it.
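
To make the pipeline idea above concrete, here is a rough Python sketch of a filter that reads an exported list of operations and applies it to CSV text. Everything in it is an assumption for illustration: the function names are hypothetical, it handles only two operation types (column rename and column removal), and the JSON field names ("op", "oldColumnName", "newColumnName", "columnName") are assumed to match what Refine exports. It is not how Refine itself replays a history.

```python
import csv
import io
import json


def apply_operations(rows, operations):
    """Apply a small, illustrative subset of operations to a list of dict rows.

    Assumption: each operation is a dict with an "op" key, as in an exported
    operation history. Only two operation types are handled here.
    """
    for op in operations:
        kind = op.get("op")
        if kind == "core/column-rename":
            old, new = op["oldColumnName"], op["newColumnName"]
            # Rebuild each row with the renamed key, keeping column order.
            rows = [{(new if k == old else k): v for k, v in row.items()}
                    for row in rows]
        elif kind == "core/column-removal":
            name = op["columnName"]
            rows = [{k: v for k, v in row.items() if k != name}
                    for row in rows]
        else:
            # Fail loudly rather than silently skipping a transformation.
            raise ValueError("unsupported operation: %r" % kind)
    return rows


def transform_csv(csv_text, operations_json):
    """Parse CSV text, apply the operations, and return the new CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows = apply_operations(rows, json.loads(operations_json))
    out = io.StringIO()
    if rows:
        writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()),
                                lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
    return out.getvalue()
```

Wrapped in a script that reads sys.stdin and writes sys.stdout, this would fit the "cat similar_data_set | ..." shape. Anything that depends on Refine's expression language would still need Refine itself.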

Some way of sharing around transformations (not sure if Freebase does this) 
would also be cool.  So you could throw in a 
--use http://google-refine/transforms/wiki-table-actors.json

Justin

Original comment by justin.w...@gmail.com on 20 May 2011 at 4:11

GoogleCodeExporter commented 8 years ago
You might want to review the discussion archives for past conversations about 
batch/headless running of Refine operation histories:

http://groups.google.com/group/google-refine/browse_thread/thread/ae6ed4a829f0ece9/edbe458ad1070de9
http://groups.google.com/group/google-refine/browse_thread/thread/19f4b56df081268c/f6fee660946b80d0
https://groups.google.com/forum/#!topic/google-refine/RAqB4rY_3GE/discussion

Original comment by tfmorris on 20 May 2011 at 5:22

GoogleCodeExporter commented 8 years ago
Will do, thanks.  Feel free to close the issue if redundant.

Original comment by justin.w...@gmail.com on 20 May 2011 at 5:26

GoogleCodeExporter commented 8 years ago
Take a look more specifically at 
https://github.com/PaulMakepeace/refine-client-py/blob/master/google/refine/refine.py#L294

That method will directly take exported transforms etc.
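
For anyone curious what replaying exported transforms against a running Refine instance might look like at the HTTP level, here is a minimal standard-library sketch that only builds the request and does not send it. It makes several assumptions that do not come from this thread: the endpoint path /command/core/apply-operations, the "project" and "operations" parameter names, and the default address http://127.0.0.1:3333. The function name is hypothetical and this is not the code of the method linked above. Newer releases may also require a CSRF token.

```python
import json
import urllib.parse
import urllib.request


def build_apply_operations_request(project_id, operations,
                                   base_url="http://127.0.0.1:3333"):
    """Build (but do not send) a POST request carrying exported operations.

    Assumptions: endpoint path, parameter names, and base_url are guesses
    about Refine's HTTP interface, used here for illustration only.
    """
    query = urllib.parse.urlencode({"project": project_id})
    # The operations list is sent as a JSON string in a form-encoded body.
    body = urllib.parse.urlencode(
        {"operations": json.dumps(operations)}).encode("utf-8")
    url = base_url + "/command/core/apply-operations?" + query
    return urllib.request.Request(url, data=body, method="POST")
```

Passing the returned object to urllib.request.urlopen would send it. Keeping construction separate makes it easy to inspect the request first.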

If you ask on the list I can provide some more help & improve the docs. 

Original comment by paulm%pa...@gtempaccount.com on 20 May 2011 at 5:27

GoogleCodeExporter commented 8 years ago
I'm going to close this.  If, after reviewing currently available options, you 
find that you still have an unmet need which falls within the scope of 
Refine, feel free to create a new enhancement request.  If you have multiple 
requests (e.g. export + batch ops), please create a separate enhancement 
request for each.

If you're unsure about whether something might be possible with the current 
software, the email list (Google Group) is a great place to get help.

Original comment by tfmorris on 20 May 2011 at 5:55