Closed djorgji closed 1 month ago
Omniparser is, by its current design, a parser: it ingests data and transforms it into JSON. It has no writer functionality, nor does it support transforming JSON into other formats.
That said, I'm very interested in your use case. If you are able to share them, I'd like to see a few examples to understand what capabilities would be required if such features were to be added.
Yep, I guess our need is more of a transformer. The use case covers both inbound and outbound EDI. Inbound EDI data is transformed to JSON so it can be loaded into a NoSQL DB. After some processing, we need to transform the changed data back to EDI and send it out.
Omniparser is schema driven when transforming data into JSON, and in the vast majority (in fact all) of our use cases it extracts a small amount of data from the original EDI and writes it into the resulting JSON.
So you can see the process "filters" down info, whereas in your proposed output process a ton of EDI segments have to be written out to make a valid EDI doc.
While theoretically we could still make the output process schema driven, I suspect the schema would be really dense.
I don't have a good idea how to approach this request yet. Any suggestions welcome. The key principles to hold:
@djorgji any comments/ideas? I will close the ticket soon if nothing new comes in.
Closed due to lack of activity and no agreed action items on the feature request.
One way might be to define a JSON structure that can be traversed to export an EDI file.
For example: translate EDI to JSON, then JSON to EDI.
So it would end up looking like this for JSON to EDI:
{ "ISA": { "Element1": "", "element2": "", etc. } } So the key is the segment identifier, and the segment's other data is nested under it. I can go a little further into this, but maybe looking at the guides provided by stedi.com/ and its JSON configs might help to build out such functionality.
Edit: In essence you would still get a JSON, but one that can be used to generate an EDI file. What I realized is that the EDI format is pretty much like a CSV file formatting-wise, and each segment has a mapping of columns, in a way. For example:
LIN^009^VC^22222~
Format LIN - Columns 1,2,3
{
  "LIN": {
    "Column1": "",
    "Column2": "",
    "Column3": ""
  }
}
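To make the idea concrete, here is a minimal Go sketch of going from that JSON shape back to a segment line. It is only an illustration: the "ColumnN" key convention and the '^' and '~' delimiters are assumptions copied from the example above, and `segmentToEDI` is a hypothetical helper, not something omniparser provides.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// renderSegment joins a segment ID and its "ColumnN" values into one EDI line.
func renderSegment(id string, cols map[string]string) (string, error) {
	// Go maps are unordered, so recover element order from the numeric key suffix.
	pos := make([]int, 0, len(cols))
	byPos := make(map[int]string, len(cols))
	for k, v := range cols {
		n, err := strconv.Atoi(strings.TrimPrefix(k, "Column"))
		if err != nil {
			return "", fmt.Errorf("segment %s: unexpected key %q", id, k)
		}
		pos = append(pos, n)
		byPos[n] = v
	}
	sort.Ints(pos)
	parts := []string{id}
	for _, n := range pos {
		parts = append(parts, byPos[n])
	}
	return strings.Join(parts, "^") + "~", nil
}

// segmentToEDI expects a JSON object with exactly one segment key,
// e.g. {"LIN": {"Column1": "009", "Column2": "VC", "Column3": "22222"}}.
func segmentToEDI(raw string) (string, error) {
	var doc map[string]map[string]string
	if err := json.Unmarshal([]byte(raw), &doc); err != nil {
		return "", err
	}
	if len(doc) != 1 {
		return "", fmt.Errorf("expected exactly one segment, got %d", len(doc))
	}
	for id, cols := range doc {
		return renderSegment(id, cols)
	}
	return "", fmt.Errorf("empty document")
}

func main() {
	out, err := segmentToEDI(`{"LIN": {"Column1": "009", "Column2": "VC", "Column3": "22222"}}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Note that a whole document would need an ordered list of segments rather than a single JSON object, since object keys carry no order and segments can repeat.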
@djorgji @aKardasz Sorry for the delayed follow-up. It is not about which mechanism to use for the json->EDI transform; there are plenty of ways/tricks to enable such a transform. What I'm looking for is a complete, real-world scenario to study the use case, so we can decide whether the json->EDI (or for that matter, json-> < any format > ) transform belongs in this library and, if so, how to design it correctly. Check my previous comment (https://github.com/jf-tech/omniparser/issues/183#issuecomment-1319191996) for my questions/concerns.
I'm a bit inclined to close this ticket given I'm not receiving feedback from the original ticket issuer, but let's wait for a few more days.
Hi sorry, with year end/holidays I have been out of hand.
The use case is essentially a two-way ETL between two incompatible systems. Most modern tooling uses JSON, so we convert from archaic formats like EDI into JSON for processing or storing (for example, writing to a NoSQL DB), and then on the way "back" we need to send it in whatever format the older system requires. As the commenter above mentioned, stedi seems to do that (first time I hear of them). However, after a quick review of their capabilities based on their website/docs, I can't use it: it is API based, which adds delays when dealing with multiple transactions.
We are a startup in transportation, where a lot of trading partners have very old systems utilizing EDI/SOAP/CSV. For example, most carrier status updates come via an EDI 214 Transportation Carrier Shipment Status, and we could receive batches with multiple lines. On the flip side, an EDI 204 Motor Carrier Load Tender is sent to let the carrier know you have a load for them to pick up; again, it can be batched if more than one load became available for that carrier in the timeframe between cron job runs.
We have SFTP folders set up for Inbound (ex: 214), where we need to go from EDI to JSON, and Outbound (ex: 204), where we need to go from JSON to EDI. The 204/outbound is not implemented yet; we were trying to understand our options (including omniparser) before trying to roll our own.
Funny you should mention other formats: a couple of weeks ago we got a use case converting JSON to XML (SOAP). We used omniparser to do a JSON to JSON transform, and then marshaled Go objects into XML (we could skip this step if omniparser could go directly from JSON to XML). To normalize the response, we used omniparser to go from XML to JSON.
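For readers following along, the "marshal Go objects into XML" step could look roughly like the sketch below, using only the standard `encoding/json` and `encoding/xml` packages. The `Shipment` type, its fields, and `jsonToXML` are made up for illustration and are not taken from the actual project.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
)

// Shipment is a hypothetical record; field names and tags are illustrative only.
type Shipment struct {
	XMLName xml.Name `json:"-" xml:"Shipment"`
	ID      string   `json:"id" xml:"Id"`
	Status  string   `json:"status" xml:"Status"`
}

// jsonToXML unmarshals JSON (e.g. omniparser output) into a Go struct,
// then marshals that struct into XML.
func jsonToXML(in string) (string, error) {
	var s Shipment
	if err := json.Unmarshal([]byte(in), &s); err != nil {
		return "", err
	}
	out, err := xml.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	out, err := jsonToXML(`{"id": "L-1", "status": "delivered"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```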
You have created a great tool, thank you! I actually checked with the team if any of them wanted to help, but sadly no one felt comfortable with that. I guess they are a shy bunch; personally, I am an older Java guy. Let me know if we can help any further.
The direction you take your project is up to you. I would love it to be "omni directional", or at least bi-directional between JSON and other formats.
@djorgji thanks a lot for your support! if you're doing this on behalf of your company, please send me an email at jf.tech.llc@gmail.com so I can grant you a company license.
And thanks for the project background. What I meant earlier by "real world scenario" is that I need to take a look:
The reason is (like I mentioned before): in practically all our use cases we only extract a very small subset of info from the input EDI and filter out the rest. In such a case, reconstructing a similar output EDI from the small subset of data is nearly impossible. Thus I'm curious what your use case (with actual samples) looks like. Are you extracting all info/fields from the incoming EDI, transforming into JSON, then possibly unmarshaling to golang objects, and then looking for some way to marshal them out into outgoing EDI?
Your use case of JSON<->XML (SOAP), that's interesting. Basically you use golang's built-in encoding/xml to marshal objects into XML. That's along the lines of what I was thinking: omniparser from the beginning was designed as an ingester, it was not designed to be an omni-writer. Once omniparser ingests (and transforms) various data formats into JSON, golang's standard favorite, you can simply unmarshal it into golang objects. From there, you can marshal your objects into JSON or XML, both supported by the built-in encoding packages. What we lack here is maybe a tag-based marshaler for EDI? Don't know for sure, need to see your example.
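Just to sketch what a "tag-based marshaler for EDI" might mean, here is a rough reflection-based example. Everything in it is hypothetical: the `edi:"N"` struct tag (element position), the `marshalSegment` function, and the '^' and '~' delimiters are invented for this illustration and exist neither in omniparser nor in the standard library.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
	"strings"
)

// LIN is a hypothetical segment struct; `edi:"N"` marks the element position.
type LIN struct {
	AssignedID string `edi:"1"`
	Qualifier  string `edi:"2"`
	ProductID  string `edi:"3"`
}

// marshalSegment writes the tagged, exported fields of a struct as one segment.
func marshalSegment(id string, v interface{}) (string, error) {
	rv := reflect.ValueOf(v)
	if rv.Kind() != reflect.Struct {
		return "", fmt.Errorf("want struct, got %s", rv.Kind())
	}
	t := rv.Type()
	elems := map[int]string{}
	last := 0
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		tag, ok := f.Tag.Lookup("edi")
		if !ok || !f.IsExported() {
			continue
		}
		pos, err := strconv.Atoi(tag)
		if err != nil || pos < 1 {
			return "", fmt.Errorf("field %s: bad edi tag %q", f.Name, tag)
		}
		elems[pos] = fmt.Sprint(rv.Field(i).Interface())
		if pos > last {
			last = pos
		}
	}
	parts := []string{id}
	for p := 1; p <= last; p++ {
		parts = append(parts, elems[p]) // untagged positions become empty elements
	}
	return strings.Join(parts, "^") + "~", nil
}

func main() {
	out, err := marshalSegment("LIN", LIN{"009", "VC", "22222"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

A real design would also have to cover loops, composite elements, and envelope segments, which is where the schema density concern comes in.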
A follow-up note: omniparser tries to be domain agnostic. It intends to be a general-purpose ingester/transformer for all formats into JSON. It seems like what you're interested in is some library of pre-made EDI-specific schemas (like 214, 210, 240, 805, etc.) and some accompanying output schemas/specs to write the ingested JSON into output EDI, pretty much like what stedi.com is doing. If that's the case, I would argue these pre-made schemas don't belong in omniparser, but rather in an omniparser-related schema library. However, we do need to either create something inside omniparser or create a new library to handle the JSON-to-output-format writing aspect of the request.
Once again, a sample input EDI, a sample omniparser schema, and a sample output EDI would be really helpful.
Hi, thanks for the replies! I am working on anonymizing some of the samples, and I will send you a couple of samples over the next couple of days to your email.
Keeping the parser agnostic to the specific X12 or EDIFACT standard makes sense. We do sometimes support custom schemas, so an agnostic library would be best.
I am not sure the parser needs to worry about the completeness of the EDI output. It would be up to the users not to run into a "garbage in, garbage out" type of situation. For example, in the XML case we had to add the SOAP envelope element ourselves before marshaling.
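As an illustration of that "caller adds the envelope" split, here is a minimal Go sketch that marshals a payload with `encoding/xml` and wraps it in a SOAP 1.1 envelope by hand. The `StatusRequest` type and the `wrapSOAP` helper are hypothetical; only the SOAP 1.1 namespace URI and the `encoding/xml` calls are standard.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// SOAP 1.1 envelope namespace.
const soapNS = "http://schemas.xmlsoap.org/soap/envelope/"

// StatusRequest is a made-up payload type for this example.
type StatusRequest struct {
	XMLName xml.Name `xml:"StatusRequest"`
	LoadID  string   `xml:"LoadId"`
}

// wrapSOAP marshals the payload and places it inside Envelope/Body.
// The envelope is written literally, so the caller fully controls it.
func wrapSOAP(payload interface{}) (string, error) {
	body, err := xml.Marshal(payload)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	buf.WriteString(xml.Header)
	buf.WriteString(`<soapenv:Envelope xmlns:soapenv="` + soapNS + `"><soapenv:Body>`)
	buf.Write(body)
	buf.WriteString(`</soapenv:Body></soapenv:Envelope>`)
	return buf.String(), nil
}

func main() {
	out, err := wrapSOAP(StatusRequest{LoadID: "L-1"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```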
Closed due to lack of activity.
Converting JSON to EDI X12. Is this supported? Is there any example?
@tb-artomu No, omniparser is a general-purpose parser for data ingestion. It doesn't have output capabilities, nor are such capabilities on its roadmap. That said, @djorgji and I are discussing the possibility of creating a separate library to support schema-driven writer features. It's at a very, very early stage and obviously there are no milestones or ETA yet.
Hello @jf-tech, I have the same requirement as @djorgji. Let me know if I can be of any help. Also, are you aware of any such solution/library I can use for the time being?
@lg-RahulYadav @djorgji I'm wondering what's the best collaborative way to make progress on this issue. Slack/Discord seems a bit too early for such a small project (omniparser). Wondering if we should simply start an email thread talking about requirements, intended inputs, and outputs. I have sketched some initial and very, very rough ideas for how to do output, most likely outside omniparser, but under the same umbrella of text-based input/output parsing and transformation. The trouble I have is that in the past we had no requirements for doing output, so the imaginary input->transform->output pipeline isn't very solid in my mind, and it is hard to come up with a concrete solution. What we need is actual use cases; then we can analyze how to design this.
@lg-RahulYadav,
aware of any such solution/library I can use for the time being
No, not that I'm aware of. Always check awesome-go first.
Some offline discussion notes for creating such a schema-based output library:
I'm wondering how we should deal with generic requirements of writing to a structured flat file (csv, potentially XML, EDI, etc) which contains loops, hierarchies, etc.
Also wondering about the verbosity issue of such a writing schema: imagine there is an input EDI, you want to do some processing on it (say standardize all the date/time fields, or do some currency conversion), and write it out under another EDI spec. One way of doing it is to parse and suck in EVERY segment, element, and component (:O, OMG!), store them in an intermediate format such as JSON, and then the output schema will have to specify, yet again, EVERY segment, element, and component in the target output EDI file. I can imagine that for any nontrivial EDI spec, such an input schema (for omniparser) and output schema (for this new library we're discussing) would be incredibly verbose and hard to maintain.
The verbosity issue is one of the reasons I'm struggling to come up with an elegant solution. The majority of our past use cases of omniparser is to distill information, i.e. extract a relatively small number of the important fields out of an otherwise large input. So essentially it's a compression algorithm, which makes the omniparser schema light and maintainable.
I'm wondering if something like this would be the future direction: creating a generic writing library based on some schema, which supports flat, loop, and hierarchy structures. Then create format-specific libraries on top of the core writing library. Such format-specific libraries can include, say, an EDI-specific one, in which we can create EDI-specific input schemas (yes, we need omniparser input schemas as well) and matching output schemas. These schemas will be dense, long, very industry and use-case specific, and maintained by each company, I imagine.
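To give a feel for what a core writer with flat, loop, and hierarchy support could look like, here is a very rough Go sketch. The `Node` schema type, its fields, the '*' and '~' delimiters, and the HDR/LIN/NTE segment names are all invented for illustration; this is not a proposed design, just a way to make the discussion concrete.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Node is a hypothetical output-schema node.
type Node struct {
	Segment  string   // segment ID to emit; empty for a pure grouping node
	Fields   []string // keys looked up in the current record, in output order
	Loop     string   // if set, key of an array in the current record; the node repeats per item
	Children []Node   // nested nodes, written after this node's segment
}

// write walks the schema and the data together, appending segments to sb.
func write(sb *strings.Builder, n Node, rec map[string]interface{}) {
	if n.Loop != "" {
		items, _ := rec[n.Loop].([]interface{})
		inner := n
		inner.Loop = "" // each item is written as a non-looping node
		for _, it := range items {
			if m, ok := it.(map[string]interface{}); ok {
				write(sb, inner, m)
			}
		}
		return
	}
	if n.Segment != "" {
		parts := []string{n.Segment}
		for _, f := range n.Fields {
			s, _ := rec[f].(string) // missing or non-string values become empty elements
			parts = append(parts, s)
		}
		sb.WriteString(strings.Join(parts, "*") + "~")
	}
	for _, c := range n.Children {
		write(sb, c, rec)
	}
}

// sampleSchema: one header, then a loop of lines, each with a nested loop of notes.
var sampleSchema = Node{Children: []Node{
	{Segment: "HDR", Fields: []string{"id"}},
	{Segment: "LIN", Fields: []string{"sku"}, Loop: "items", Children: []Node{
		{Segment: "NTE", Fields: []string{"text"}, Loop: "notes"},
	}},
}}

func render(schema Node, jsonIn string) (string, error) {
	var rec map[string]interface{}
	if err := json.Unmarshal([]byte(jsonIn), &rec); err != nil {
		return "", err
	}
	var sb strings.Builder
	write(&sb, schema, rec)
	return sb.String(), nil
}

func main() {
	out, err := render(sampleSchema, `{"id":"O1","items":[{"sku":"A","notes":[{"text":"n1"}]},{"sku":"B"}]}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Even in this toy form, every output segment and field has to be listed, which is the verbosity concern raised above.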
For the time being, I tried implementing EDI to JSON using templates. In real-life cases, companies choose a subset of segments for any particular EDI format. So the solution I have been thinking of is to keep the file declarations the same as we have now in omniparser, and make the file output a template, at least for EDI? Or, as you mentioned, we can have some intermediate JSON, and then convert that particular JSON to other formats? For EDI specifically, we can use templates? What are your thoughts on this?
@tb-artomu You may explore the templates method. I recently was able to convert JSON to EDI X12 using Python templates (Mako).
I would agree on the templates bit. From my understanding of communicating over EDI, there is a lot of variation between clients in their adoption of the "standard". Either way, there are a lot of "always output" parts.
@djorgji True, I may be completely wrong here. So EDI follows a particular structure. Taking X12 850 as an example, if we have input data filled into the mentioned structure, then we can write it to X12 850 easily. Do you have any idea how we can fill data into this particular format in case we choose other structures as well?
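For anyone wanting to try the template route in Go rather than Python, a minimal sketch with the standard `text/template` package might look like this. The template text, the JSON field names (`loads`, `scac`, `shipment_id`, `stops`, `seq`, `reason`), and `renderEDI` are assumptions for illustration; the B2 and S5 lines are simplified and do not form a valid X12 204.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"text/template"
)

// The "always output" parts live in the template; only the data varies.
// missingkey=error makes a missing JSON field fail loudly instead of
// silently printing "<no value>" into the EDI.
var ediTmpl = template.Must(template.New("edi").Option("missingkey=error").Parse(
	`{{range .loads}}B2**{{.scac}}**{{.shipment_id}}~{{range .stops}}S5*{{.seq}}*{{.reason}}~{{end}}{{end}}`))

// renderEDI decodes the JSON into generic maps and executes the template on it.
func renderEDI(jsonIn string) (string, error) {
	var data map[string]interface{}
	if err := json.Unmarshal([]byte(jsonIn), &data); err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := ediTmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	in := `{"loads":[{"scac":"ABCD","shipment_id":"L1",
	  "stops":[{"seq":"1","reason":"LD"},{"seq":"2","reason":"UL"}]}]}`
	out, err := renderEDI(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

One template per trading partner would be a natural way to absorb the variation between clients mentioned above.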
The above JSON structure resembles the 850 structure of EDI (not all nodes are included). I'm able to write X12 850 using this structure with data filled in. But in the real world, we have some details in an object and then line items as an array of objects. I'm trying to figure out how I can take data from there and fill it into the kind of standard structure mentioned earlier.
As of now, a solution I'm thinking of is this: in the standard schema, we have all the components listed, so against each component we may add details like path (where the value will be fetched from, just like XPath), type, limit, etc.? Your inputs?
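A small Go sketch of that "path per component" idea, under stated assumptions: the `Segment` and `Element` schema types, the dotted-path syntax, and the '*' and '~' delimiters are invented here, and the BEG example is only loosely modeled on an 850 header.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Element describes one output component (hypothetical schema type).
type Element struct {
	Const  string // literal value, used when Path is empty
	Path   string // dotted path into the input JSON, e.g. "order.number"
	MaxLen int    // length limit in bytes; 0 means no limit
}

// Segment is one output segment with its ordered elements.
type Segment struct {
	ID       string
	Elements []Element
}

// lookup follows a dotted path through nested JSON objects and returns a string leaf.
func lookup(data map[string]interface{}, path string) (string, bool) {
	var cur interface{} = data
	for _, key := range strings.Split(path, ".") {
		m, ok := cur.(map[string]interface{})
		if !ok {
			return "", false
		}
		if cur, ok = m[key]; !ok {
			return "", false
		}
	}
	s, ok := cur.(string)
	return s, ok
}

// render resolves every element against the input JSON and writes the segments.
func render(schema []Segment, jsonIn string) (string, error) {
	var data map[string]interface{}
	if err := json.Unmarshal([]byte(jsonIn), &data); err != nil {
		return "", err
	}
	var sb strings.Builder
	for _, seg := range schema {
		parts := []string{seg.ID}
		for _, e := range seg.Elements {
			v := e.Const
			if e.Path != "" {
				var ok bool
				if v, ok = lookup(data, e.Path); !ok {
					return "", fmt.Errorf("%s: no string value at path %q", seg.ID, e.Path)
				}
			}
			if e.MaxLen > 0 && len(v) > e.MaxLen {
				return "", fmt.Errorf("%s: value at %q exceeds %d bytes", seg.ID, e.Path, e.MaxLen)
			}
			parts = append(parts, v)
		}
		sb.WriteString(strings.Join(parts, "*") + "~")
	}
	return sb.String(), nil
}

func main() {
	schema := []Segment{{ID: "BEG", Elements: []Element{
		{Const: "00"}, {Const: "SA"}, {Path: "order.number", MaxLen: 22},
	}}}
	out, err := render(schema, `{"order": {"number": "PO123"}}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Line items stored as an array of objects would need a loop construct on top of this, which the flat sketch does not cover.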
In the context of 'reading a custom format' with omniparser (even if the current usecase is json), are there some examples of how to efficiently 'hook' to feed into this future or custom code renderer/marshaller towards the converted file format?
i.e. just thinking IDR, old school sax eventing, or if can 'aggregate' say EDI segment groups into smaller programmatic go structure (efficiently) to then send to a renderer/marshaller in relevant domain bits? (if we can all do this streaming for example, versus multi-phase)
Old school smooks/java trying to shift to golang world, and this particular ticket is very interesting :-)
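To illustrate the SAX-style eventing idea in Go (independent of omniparser's own reader, whose API is not shown in this thread), here is a small sketch that streams segments from an `io.Reader` and calls a handler per segment. The fixed '~' and '*' delimiters are an assumption; real X12 declares its delimiters in the ISA header. `streamSegments` and `segmentIDs` are hypothetical helpers.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// splitSegments is a bufio.SplitFunc that yields one '~'-terminated segment at a time.
func splitSegments(data []byte, atEOF bool) (advance int, token []byte, err error) {
	if i := bytes.IndexByte(data, '~'); i >= 0 {
		return i + 1, bytes.TrimSpace(data[:i]), nil
	}
	if atEOF {
		// Flush any trailing, unterminated segment (nil if only whitespace is left).
		return len(data), bytes.TrimSpace(data), nil
	}
	return 0, nil, nil // need more data
}

// streamSegments calls handle once per segment without buffering the whole file.
func streamSegments(r io.Reader, handle func(id string, elems []string) error) error {
	sc := bufio.NewScanner(r)
	sc.Split(splitSegments)
	for sc.Scan() {
		seg := sc.Text()
		if seg == "" {
			continue
		}
		parts := strings.Split(seg, "*")
		if err := handle(parts[0], parts[1:]); err != nil {
			return err
		}
	}
	return sc.Err()
}

// segmentIDs is a tiny handler example: it just records the segment IDs seen.
func segmentIDs(edi string) ([]string, error) {
	var ids []string
	err := streamSegments(strings.NewReader(edi), func(id string, _ []string) error {
		ids = append(ids, id)
		return nil
	})
	return ids, err
}

func main() {
	ids, err := segmentIDs("ST*214*0001~\nB10*INV1*SHP1*SCAC~\nSE*3*0001~\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(ids)
}
```

A handler could aggregate a segment group into a small Go struct and hand it to a renderer as soon as the group closes, which keeps the whole pipeline single-pass.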
Just to throw in my 2 cents.
I am interested in something that in the EDI section can convert a lot of formats into an "inhouse" format and then export it to any wanted format - not just to JSON.
So let's assume we have to convert:
XML ==> EDIFACT
Then I would recommend not going over JSON but actually GOB (if it is just used as a temporary middle stop), so it would be:
XML ==> GOB ==> EDIFACT
GOB is Golang's alternative to JSON, which has a way smaller footprint and usually outperforms JSON by a lot!
So, if:
go with GOB.
Other than this use JSON. Btw I also would be very interested in a nice very performant solution that can convert most EDI formats, with a nice easy-to-use mapping.
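For reference, using gob as the temporary middle stop could be sketched as below with the standard `encoding/gob` package. The `Record` type and `roundTrip` are made up for this example. Keep in mind that gob is a Go-specific format, and whether it is actually smaller or faster than JSON for a given payload is worth measuring rather than assuming.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Record is a hypothetical in-house representation of one parsed segment.
type Record struct {
	Segment  string
	Elements []string
}

// roundTrip encodes records to gob and decodes them again, standing in for
// the "input ==> GOB ==> output" hand-off between two pipeline stages.
func roundTrip(in []Record) ([]Record, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(in); err != nil {
		return nil, err
	}
	var out []Record
	if err := gob.NewDecoder(&buf).Decode(&out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	out, err := roundTrip([]Record{{Segment: "LIN", Elements: []string{"009", "VC", "22222"}}})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", out)
}
```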
P.S.:
Omniparser is schema driven when transforming data into json.
Ah sh*t .. I came here from LINK
and had this in mind:
[...] We just open sourced an small library in our internal ETL pipeline called omniparser - a schema driven / codeless parser to ingest files (many formats supported out of box) in streaming fashion and transform into desired output [...]
But it seems it just supports converting to JSON?
I would like to do:
For most of these transformations I personally would recommend using a step in between (an in-house format), which should be GOB/JSON (depending on the data).
But I guess this is not possible out of the box?
Seeing as this issue is still open, I'm throwing in my two cents:
At my workplace we have developed an EDI system that takes any kind of text file, extracts the information according to a schema, checks business rules, validates, and maps to any format, be it JSON, XML, EDIFACT, X12, IDoc, XLSX, etc.
This project contains over 1.7M lines of Go code, and the core packages are over 500K, so the feature discussed in this issue may be a serious undertaking.
Although I just discovered the project, I would be happy to contribute if there is renewed interest in tackling this issue, as I can see the potential to reduce the overall code base of our product using omniparser.
Sorry, everyone involved here (@alfredovaldes @djorgji @dhartford @the-hotmann @aKardasz), for not updating this issue frequently. As we mentioned before, given we don't currently have a solid use case for such a template-driven JSON -> omniwriter project, we're not moving forward with it. That is not to say we don't want to get involved if concrete examples are provided. By concrete, we mean real input, real output, and details about the mapping instructions, potentially multiple pairs, so we can analyze them and propose the best way of architecting such a writer.
Hi all, first of all I would say this tool is great, thanks for all that you have done. I just have a simple question if there is a way to convert json to EDI (X12), or whether something like that is on a roadmap somewhere?