Open EstherArens opened 8 years ago
From what we talked about in the session I understand the data you start with looks something like:
Order ID: 123456
Order number: PO03453
Order date: 2015-11-01
and
Order ID: 123456
Title ID: 98765
Is that about right?
Thanks for picking this up again, Owen. Really appreciated ☺
My first/main file (orders2migrate.txt) has records of the following structure * DOCUMENT BOUNDARY * .ORDR_ID. |aPO-95 .ORDR_LIBR. |aDWL .FISCAL_CYCLE. |a9899 .VEND_ID. |aHOU .ORDR_TYPE. |aFIRM .ORDR_DATE_READY. |a19980818 .ORDR_DATE_MAILED. |a19980826 .ORDR_DATE_CLAIM. |a19981211 .ORDR_DATE_CANCEL. |aNEVER .ORDR_XINFO_BEGIN. .FORMAT. |ahbk .FUNDCODE. |aP .ORDR_XINFO_END. .LINE_ITEM_BEGIN. .LINE_ITEM_ID. |a7504324004 .LINE_UNIT_PRICE. |a£56.95 .LINE_VEND_CURR. |aSTERLING .LINE_EXCHANGE. |a1.0 .LINE_COPIES. |a1 .LINE_DISCOUNT. |aTABLE .LINE_EXTEND_PRICE. |a£56.95 .LINE_MULTI_PARTS. |aN .LINE_RENEWAL_DATE. |aNEVER .LINE_REQ_USER_ID. |a0750854256 .LINE_DATE_MAILED. |a19980826 .LINE_XINFO_BEGIN. .ITEMTYPE. |aNORMAL .NOTE. |aOriginal Libertas order P-440886 .LINE_XINFO_END. .FUND_SEGMENT_BEGIN. .LINE_FUND_ID. |aMICROB .LINE_FUND_COPIES. |a1 .FUND_DATE_PAID. |a19981016 .FUND_AMT_PAID. |a£56.95 .FUND_SEGMENT_END. .DIST_SEGMENT_BEGIN. .HOLDING_CODE. |aM-NORMAL .LINE_DIST_COPIES. |a1 .DIST_DATE_RECEIVED. |a19981013 .DIST_DATE_DELIVERED. |a19981013 .DIST_PACKING_LIST. |aPACKING-676 .DIST_SEGMENT_END. .LINE_ITEM_END.
The other file (keys2insert.txt) has one line per ‘record’ i.e. 3540638172 |1|DWL|9899|PO-95|
The common identifier in both is the order ID (here “PO-95”). Via this I need to take the values of the first two columns of keys2insert.txt and insert them into the matching record in orders2migrate.txt. And these two values should be preceded respectively by .CATALOG#. |a .ORDERLINE_KEY. |a
Caveat: Not all orders (records in the first file) still have title keys & order lines (row in the second file) – these are mostly historic orders that could never be supplied and for which the title has since been removed.
Although I’ve used OpenRefine today (already useful to just look what is in a data set ☺), I’ve not tried it for this – any hints gratefully received. Thanks in advance, Esther
From: Owen Stephens [mailto:notifications@github.com] Sent: 30 November 2015 22:52 To: LibraryCarpentry/week-four-library-carpentry Cc: Arens, Esther A. Subject: Re: [week-four-library-carpentry] Lookup between two data extracts (#6)
From what we talked about in the session I understand the data you start with looks something like:
Order ID: 123456
Order number: PO03453
Order date: 2015-11-01
and
Order ID: 123456
Title ID: 98765
Is that about right?
— Reply to this email directly or view it on GitHubhttps://github.com/LibraryCarpentry/week-four-library-carpentry/issues/6#issuecomment-160787269.
@EstherArens probably a few approaches, but the first one that springs to mind:
value.match(/\.[A-Z_]+\.\s*(\|[a-z0-9])(.*)/)[1]
value+"~.CATALOG#. |a"+cells["Content"].cross("keys2insert txt","Column 5")[0].cells["Column 1"].value+"~.ORDERLINE_KEY. |a"+cells["Content"].cross("keys2insert txt","Column 5")[0].cells["Column 2"].value
At this point, you should have a line based file with the two new data pieces added. Remove the additional column you added, then export as some kind of delimited file (no delimiter will be used as only one col)
Does that make sense?
I need to insert a title identifier into an extract of order lines from a library system. An extract of order line data gives me an order ID as unique identifier, and for each order ID an additional extract can give me the title key. How to best?