LibreCat / Catmandu

Catmandu - a data processing toolkit
https://librecat.org

"Flattening" records into unique rows #388

Closed jasloe closed 2 years ago

jasloe commented 2 years ago

I'm pretty familiar with the Fix language but am having a hard time conceptualizing the following problem. Short of attaching Catmandu output to an external data store or RDBMS, is there a way to "flatten" records into a relational table? For example:

MARC sources:

=001  80211
=650  \0$aMusic for cellos
=650  \0$aConcertos (strings)

=001  80212
=650  \0$aMusic for violin
=650  \0$aConcertos (strings)

output.csv:

001,650a
80211,Music for cellos
80211,Concertos (strings)
80212,Music for violin
80212,Concertos (strings)

or, even better, with a unique identifier:

unique_id,001,650a
1,80211,Music for cellos
2,80211,Concertos (strings)
3,80212,Music for violin
4,80212,Concertos (strings)

phochste commented 2 years ago

This can be done with a marc_each and add_to_exporter trick.

For example, given a MARC file camel.mrc and a fix file test.fix like:

marc_map("001",x.id)
do marc_each()
  if marc_has(650a)
    marc_map("650a",x.test)
    add_to_exporter(x,CSV)
  end
end

then you can do something like:

catmandu convert MARC to Null --fix test.fix < camel.mrc | nl -s ,

The nl command adds the line numbers.

Cheers Patrick
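
To make the row-per-subject idea concrete outside of Catmandu, here is a minimal Python sketch of the same flattening. It is not Catmandu's implementation: the names `flatten_records` and `to_csv` and the dictionary layout of the input records are assumptions made for illustration, and the sample data is the two records from the question. Each record is expanded into one row per 650a value, and a running counter provides the unique identifier.

```python
import csv
import io


def flatten_records(records):
    """Return one (unique_id, record_id, subject) tuple per 650a value.

    `records` is a hypothetical in-memory layout: a list of dicts with
    the control number under "001" and a list of subjects under "650a".
    """
    rows = []
    unique_id = 0
    for record in records:
        for subject in record["650a"]:
            unique_id += 1  # the counter spans all records, not one record
            rows.append((unique_id, record["001"], subject))
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["unique_id", "001", "650a"])
    writer.writerows(rows)
    return buffer.getvalue()


# Sample data taken from the question above.
records = [
    {"001": "80211", "650a": ["Music for cellos", "Concertos (strings)"]},
    {"001": "80212", "650a": ["Music for violin", "Concertos (strings)"]},
]

print(to_csv(flatten_records(records)), end="")
```

In this sketch the header row is written separately, so it is never counted. By contrast, a plain `nl` numbers every non-empty line it receives, including a CSV header line if one is present, and pads the numbers with leading spaces by default.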



vpeil commented 2 years ago

Question seems to be solved.

jasloe commented 2 years ago

Solved indeed. Thanks so much!