LibreCat / Catmandu

Catmandu - a data processing toolkit
https://librecat.org

Simple mapping table as fix #67

Closed nichtich closed 3 years ago

nichtich commented 10 years ago

Many data conversions are specified as simple mapping tables given as spreadsheets or CSV. Fields included in a mapping table are moved/renamed, and any fields not included are removed. How about using CSV as the input format for such mapping fixes?

vpeil commented 10 years ago

nice idea. I'm working on this.

vpeil commented 10 years ago

A simple mapping looks like

old, new
x, y
z, a

@nichtich I suppose you also had this in mind (a mapping which works on paths):

old, new
title.originalTitle, mainTitle
year, publishingYear.iso

Correct?

nichtich commented 10 years ago

Yes, but maybe without header:

title.originalTitle, mainTitle
year, publishingYear.iso

Applying this mapping would create a record with at most two fields, "mainTitle" and "publishingYear", where the latter has one field "iso".

The mapping could also make use of another fix, e.g. marc_map, but move_field is enough to start with.
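
To make the intended semantics concrete, here is a minimal sketch in Python rather than Catmandu code. It assumes a headerless two-column CSV and dotted paths as in the example above; the helper names (`get_path`, `set_path`, `apply_mapping`) and the sample record are hypothetical and only illustrate the idea that mapped fields are moved while everything else is dropped.

```python
import csv
import io


def get_path(record, path):
    """Return (found, value) for a dotted path such as 'title.originalTitle'."""
    node = record
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return False, None
        node = node[key]
    return True, node


def set_path(record, path, value):
    """Set a dotted path, creating intermediate dicts as needed."""
    *parents, last = path.split(".")
    node = record
    for key in parents:
        node = node.setdefault(key, {})
    node[last] = value


def apply_mapping(record, mapping_csv):
    """Build a new record that holds only the fields named in the mapping."""
    result = {}
    reader = csv.reader(io.StringIO(mapping_csv), skipinitialspace=True)
    for row in reader:
        if len(row) != 2:
            continue  # skip blank or malformed lines
        old, new = (cell.strip() for cell in row)
        found, value = get_path(record, old)
        if found:
            set_path(result, new, value)
    return result


# Headerless mapping table, as suggested in this comment.
MAPPING = """title.originalTitle, mainTitle
year, publishingYear.iso
"""

# Hypothetical input record; "publisher" and "title.subtitle" are unmapped.
record = {
    "title": {"originalTitle": "Faust", "subtitle": "Eine Tragödie"},
    "year": "1808",
    "publisher": "Cotta",
}

mapped = apply_mapping(record, MAPPING)
print(mapped)  # {'mainTitle': 'Faust', 'publishingYear': {'iso': '1808'}}
```

Building a fresh result record, instead of renaming in place, is what makes the removal of unmapped fields fall out for free in this sketch.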

vpeil commented 10 years ago

I see, marc_map would be a better choice, but then we have a dependency on Catmandu-Marc. Maybe this should be included somewhere else?

vpeil commented 10 years ago

Wait until monads are released; it should be easy then. The current status is in https://github.com/vpeil/Catmandu/tree/mapping

vpeil commented 10 years ago

@nichtich: the latest Catmandu release contains bind-hashmap. As far as I can see, that's it. Close this issue?

nichtich commented 10 years ago

No, bind-hashmap cannot be used this way, as it uses an additional hashref instead of working on the normal items. The mapping table should be provided as a hash, not generated. See another use case here: https://github.com/gbv/unapi-catmandu/blob/1e83bf31fa9ff4ec60ea9506328a6390bef71574/doi.psgi#L10 (here aref_mapping uses a mapping table to apply multiple aref_query).

nics commented 6 years ago

It would be nice to enable this with a generic solution:

bind args(file: mapping.csv)
  move_field(arg1, arg2)
end

That doesn't take care of the deletions, of course.
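
The generic idea can be sketched in Python as follows. This is not Catmandu's API: `apply_rows` is a hypothetical stand-in for the proposed bind, and `move_field` here is a simplified top-level rename, not the real fix. The sketch only shows one operation being called once per CSV row, and why unmapped fields survive.

```python
import csv
import io


def move_field(record, old, new):
    """Rename a top-level field in place; do nothing if it is absent."""
    if old in record:
        record[new] = record.pop(old)


def apply_rows(record, rows_csv, operation):
    """Call operation(record, arg1, arg2) once per two-column CSV row."""
    for row in csv.reader(io.StringIO(rows_csv), skipinitialspace=True):
        if len(row) == 2:
            operation(record, row[0].strip(), row[1].strip())
    return record


# Stand-in for the contents of mapping.csv (assumed headerless).
rows = "x, y\nz, a\n"

# Hypothetical item; "extra" does not appear in the mapping.
item = {"x": 1, "z": 2, "extra": 3}
apply_rows(item, rows, move_field)
print(item)
```

Because the rows are applied in place, `extra` is still present afterwards, so a separate step would be needed to drop fields that the table does not mention.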

nics commented 3 years ago

added mapping fix (see #366)