```
What steps will reproduce the problem?
1. Windows 7 x64 with 8 GB of physical memory
2. Large CSV file (733 MB)
3. In google-refine.l4j.ini, set
# max memory heap size
-Xmx4096M
4. Save …
```
*Original comment by @markharwood:*
In datasets like the Panama Papers, noisy duplicate data rears its head and is a major pain.
Consider the near-duplicate names in this real example:
!LINK…
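The kind of near-duplicate detection the comment describes can be sketched with a simple pairwise similarity check. The snippet below is only an illustration, not the tool's actual clustering implementation; the sample names and the `0.85` threshold are hypothetical stand-ins for the (elided) linked example.

```python
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.lower().split())

def near_duplicates(names, threshold=0.85):
    """Return (a, b, ratio) pairs whose normalized similarity meets the threshold."""
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
            if ratio >= threshold:
                pairs.append((a, b, round(ratio, 2)))
    return pairs

# Hypothetical near-duplicate entries of the kind the comment describes
names = ["Mossack Fonseca & Co.", "MOSSACK FONSECA & CO", "Acme Holdings Ltd"]
print(near_duplicates(names))
```

The quadratic pairwise loop is fine for a sketch but would not scale to a full dump; real deduplication tools typically bucket records by a key (fingerprinting, n-grams) before comparing within buckets.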