hamidaayoub / google-refine

Automatically exported from code.google.com/p/google-refine

Data set joins #96

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Feature suggestion: Data across multiple Gridworks projects could be reconciled 
to each other.  This would allow data set joins, and augmentation of data 
(i.e. similar to 'add data from Freebase') across projects.

This requires some more thinking on how Gridworks can accommodate this 
internally.

Original issue reported on code.google.com by iainsproat on 23 Jun 2010 at 5:20

GoogleCodeExporter commented 8 years ago
Iain, this is supported through the cross GEL function. Could we work together 
to verify if it does what you want?

Original comment by dfhu...@gmail.com on 18 Jul 2010 at 1:21

GoogleCodeExporter commented 8 years ago

Original comment by dfhu...@google.com on 28 Sep 2010 at 4:59

GoogleCodeExporter commented 8 years ago
There is something a bit screwy about the cross function in GREL, at least in 
all the versions I've tried, up to v2.1 r2098 on OS X 10.5.8.

The problem is not entirely straightforward to reproduce, but it goes like 
this. An entirely correct cross expression will return an "Error: Cannot 
retrieve field from null" when it is used on a project that has been receiving 
some work. 

However, stopping Refine, waiting a bit (which seems to be the important 
bit), and then restarting the app allows exactly the same cross expression 
to work fine.  

It is not sufficient to force quit Refine and restart it once, or even twice; 
the problem will persist. Nor is it sufficient to rename projects, remove 
column heading whitespace, or try any of the other voodoo I've had cause to try. 

Original comment by ioeas...@gmail.com on 30 Jun 2011 at 2:57

GoogleCodeExporter commented 8 years ago
Google Refine 2.5 RC2 still has this issue. 

Workaround: export the project, do a VLOOKUP in Excel, and create a new project 
in Refine.
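
For anyone who would rather script that workaround than use Excel, the same exact-match lookup can be sketched in Python with only the standard library. This is a rough illustration and not part of Refine: the CSV contents, the column names (`key`, `name`, `city`) and the `lookup_join` helper are all made up for the example, and it assumes both projects were exported as CSV.

```python
import csv
import io


def lookup_join(main_rows, lookup_rows, key, value_column):
    """Add value_column from lookup_rows to each main row, matching on key exactly."""
    # Index the lookup table by key, keeping the first row per key (like VLOOKUP).
    index = {}
    for row in lookup_rows:
        index.setdefault(row[key], row)

    joined = []
    for row in main_rows:
        match = index.get(row[key])
        new_row = dict(row)
        # Leave the cell empty when the key has no match in the lookup table.
        new_row[value_column] = match[value_column] if match else ""
        joined.append(new_row)
    return joined


# Hypothetical stand-ins for two CSV exports; with real files,
# use open(path, newline="") instead of io.StringIO.
MAIN_CSV = "key,name\nA1,Alice\nB2,Bob\nC3,Carol\n"
LOOKUP_CSV = "key,city\nA1,Paris\nB2,Lima\n"

main_rows = list(csv.DictReader(io.StringIO(MAIN_CSV)))
lookup_rows = list(csv.DictReader(io.StringIO(LOOKUP_CSV)))
result = lookup_join(main_rows, lookup_rows, "key", "city")
```

The joined rows can then be written back out with `csv.DictWriter` and imported as a new project.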

Original comment by Martin.M...@gmail.com on 2 Dec 2011 at 1:09

GoogleCodeExporter commented 8 years ago
I am also facing this issue and it is really causing a lot of pain for me. Is 
there any workaround apart from Excel?

Original comment by rockey.n...@leapgradient.com on 2 May 2012 at 7:10

GoogleCodeExporter commented 8 years ago
Does anyone have a reliable way to reproduce this problem?  I can 
take a look at it, but I'd rather spend my time fixing the bug than trying to 
figure out how to reproduce it.

Original comment by tfmorris on 9 Jun 2012 at 2:52

GoogleCodeExporter commented 8 years ago
Tom,

From my experience and from what others are posting, it looks like this is a 
completely random bug that has no relation to a particular data set. Could 
it come from local settings?

Martin

Original comment by Martin.M...@gmail.com on 9 Jun 2012 at 2:56

GoogleCodeExporter commented 8 years ago
I experienced this as well. It occurred on my first use of Google Refine. 
Closing and restarting fixed the problem.

Original comment by david.a....@gmail.com on 19 Jun 2012 at 3:01

GoogleCodeExporter commented 8 years ago
I followed the steps in the example at the bottom of the GRELOtherFunctions page 
and got the same error. Even after closing and reopening Refine, the cross 
function doesn't work.

Original comment by ruben.v....@gmail.com on 23 Aug 2012 at 11:12

GoogleCodeExporter commented 8 years ago
I figured out why it wasn't working for me: I copied the tables from the 
referenced page into Refine by copy/paste, which left some values with a 
leading space, so the key values were not identical. Since the tables are small, 
I removed the spaces manually. Now it is working fine.
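
The failure mode can be shown outside Refine with a small Python sketch. It is only an illustration of exact-match key lookup, not Refine's actual implementation; the rows, the column names (`id`, `value`) and the `build_index` helper are hypothetical.

```python
def build_index(rows, key, normalize=lambda s: s):
    """Map each (optionally normalized) key value to its row."""
    return {normalize(row[key]): row for row in rows}


# Hypothetical lookup table: the second key picked up a leading space from copy/paste.
lookup_rows = [
    {"id": "A1", "value": "first"},
    {"id": " B2", "value": "second"},
]

# Exact matching on the raw keys: "B2" and " B2" are different strings.
raw_index = build_index(lookup_rows, "id")
found_raw = raw_index.get("B2")

# Stripping surrounding whitespace from the keys first makes the lookup succeed.
clean_index = build_index(lookup_rows, "id", normalize=str.strip)
found_clean = clean_index.get("B2")
```

Here `found_raw` is `None` while `found_clean` is the second row. Inside Refine, trimming leading and trailing whitespace on the key columns of both projects should have the same effect for larger tables where manual cleanup is impractical.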

Original comment by ruben.v....@gmail.com on 24 Aug 2012 at 3:09

GoogleCodeExporter commented 8 years ago
Same problem here! It is very random, but I observed that I had the problem 
when trying to merge large datasets (500k+ rows). 

Original comment by mar...@fifty-five.com on 30 Aug 2012 at 3:19

GoogleCodeExporter commented 8 years ago
Simple exact match joins are supported by cross(), but cross project 
reconciliation with a UI similar to reconciling against Freebase still isn't 
supported, so I'll leave this feature request open.

General questions (Ruben) should go in the Google Group.

Please use issue 432 for discussion of the cross() caching problem and open 
separate bug reports for unrelated issues.

This issue should stay focused on the enhancement request.

Original comment by tfmorris on 30 Aug 2012 at 3:35

GoogleCodeExporter commented 8 years ago
I've been working with Google Refine for most of two weeks (90+ hours) and 
everything is working really well except for the random failure of the cross 
function.  I've burned at least half of my time trying to get cross to behave 
consistently.  I've tried all the workarounds I've read about, but I absolutely 
cannot get cross to work for certain cells.  "JNB01" works fine, but "JEN01" 
doesn't.  Even manually re-entering the data in both projects doesn't seem to 
help!  The project I'm trying to update has 44k rows, and the project 
I'm looking up data from has 2.6k rows.  That doesn't seem overly large.  Any 
estimate on when an updated cross function will be released?

Original comment by jbr...@gmail.com on 26 Jul 2013 at 6:40