opentraveldata / geobases

Data services and visualization
http://opentraveldata.github.com/geobases/

Tricky duplicates bug #3

Closed alexprengere closed 11 years ago

alexprengere commented 11 years ago

Duplicates are currently handled by generating keys of the form key@nb, which leads to the following bug.

Suppose we have a file whose keys are already formatted like generated duplicate keys:

$ cat file.csv
NCE@1
NCE
NCE

Now pipe that into GeoBase:

$ cat file.csv | GeoBase -q -s __key__
#__key__
NCE@1
NCE

One line is missing from this output: the third line created a duplicate for NCE, named NCE@1, which overwrote the data loaded from the first line.
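To make the failure mode concrete, here is a minimal Python sketch of a loader that names duplicates key@nb without checking whether that name is already in use. This is not the GeoBases source: `load_naive`, `dup_count` and the line-number payload are hypothetical names chosen for illustration, and the real loader may well differ in its details.

```python
def load_naive(lines):
    """Hypothetical loader: duplicates become key@nb, with no collision check."""
    data = {}       # generated key -> line number (stand-in for the row data)
    dup_count = {}  # original key -> number of duplicates seen so far
    for line_no, key in enumerate(lines, start=1):
        if key in data:
            # The key was already loaded, so derive a duplicate key from it.
            dup_count[key] = dup_count.get(key, 0) + 1
            key = f"{key}@{dup_count[key]}"
        # Assumption: the generated key is stored blindly, so it can
        # overwrite a key that was present in the input file itself.
        data[key] = line_no
    return data


naive = load_naive(["NCE@1", "NCE", "NCE"])
print(sorted(naive))  # only two keys remain for three input lines
```

With the three-line file above, the entry stored under NCE@1 ends up pointing at line 3 instead of line 1, which matches the missing line in the output.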

alexprengere commented 11 years ago

Fixed with f9828821a7ea30598d67ba36fc58f7dc1f79678c. Now:

$ cat file.csv
NCE@1
NCE
NCE
$ cat file.csv | GeoBase -q -s __key__ H0
#__key__^H0
NCE@2^NCE
NCE@1^NCE@1
NCE^NCE

The duplicate key generated for the second NCE is now NCE@2, because NCE@1 was already taken.
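The behavior described here can be sketched in Python as follows. This is an assumption about the general idea (keep incrementing nb until the generated key is free), not a copy of the referenced commit; `load_safe` is a hypothetical name, and each entry stores the line number and the original key purely for illustration.

```python
def load_safe(lines):
    """Hypothetical loader: pick the first key@nb that is not already taken."""
    data = {}  # generated key -> (line number, original key)
    for line_no, original in enumerate(lines, start=1):
        key = original
        nb = 0
        # Skip over every candidate that is already in use, including
        # keys that appeared literally in the input (such as NCE@1).
        while key in data:
            nb += 1
            key = f"{original}@{nb}"
        data[key] = (line_no, original)
    return data


safe = load_safe(["NCE@1", "NCE", "NCE"])
for key, (line_no, original) in safe.items():
    print(key, original, line_no)
```

All three input lines survive: the second NCE is stored under NCE@2, and the entry loaded from the first line keeps the key NCE@1.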