compose / transporter

Sync data between persistence engines, like ETL only not stodgy
https://github.com/compose/transporter/issues/523
BSD 3-Clause "New" or "Revised" License

Message is noisy when no id present #29

codepope closed 9 years ago

codepope commented 9 years ago

If incoming documents have no id or _id field, the extractID function in message.go emits an error along with the full document (line 59 of message.go). There's no way to suppress this, so if you are using the transporter to import un-id'd raw data with the intent of letting the target database create an id for it, you'll get a lot of errors on stdout. Options? Remove the error printing, move the error to stderr, or add a bool noid argument to NewMsg which, when true, skips the id manipulation at line 39 and the generation of the error. (I prefer the last option, as it would allow adapter authors to decide on behaviour.)
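To make the third option concrete, here is a minimal sketch in Go. It is not the code in message.go: the Msg type, the NewMsg signature, the warn callback, and the warnToStderr helper are all assumptions made for illustration. Only the names extractID, NewMsg, and noid come from the description above.

```go
package main

import (
	"fmt"
	"os"
)

// Msg is a hypothetical stand-in for transporter's message type.
type Msg struct {
	ID   interface{}
	Data map[string]interface{}
}

// extractID looks for "_id" first, then "id", and reports whether
// either field was present. (Illustrative, not the original function.)
func extractID(doc map[string]interface{}) (interface{}, bool) {
	for _, key := range []string{"_id", "id"} {
		if v, ok := doc[key]; ok {
			return v, true
		}
	}
	return nil, false
}

// warnToStderr sends warnings to stderr so they stay out of stdout.
func warnToStderr(msg string) {
	fmt.Fprintln(os.Stderr, "warning:", msg)
}

// NewMsg builds a message from doc. When noid is true, the id lookup
// and the warning are both skipped, so the adapter author decides.
// Otherwise a missing id is reported once through warn.
func NewMsg(doc map[string]interface{}, noid bool, warn func(string)) *Msg {
	m := &Msg{Data: doc}
	if noid {
		return m
	}
	id, ok := extractID(doc)
	if !ok {
		warn("document has no id or _id field")
		return m
	}
	m.ID = id
	return m
}
```

An adapter importing raw data could then call NewMsg(doc, true, warnToStderr) and never see the warning, while adapters that depend on ids keep the current behaviour by passing false.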

nstott commented 9 years ago

Yeah, that's true. In the original implementation it was a requirement that each document have an _id (or id). I don't think that requirement is as strict now, and for datastores (like redis, etc), an id really isn't meaningful.

nstott commented 9 years ago

I'm trying to work out how to deal with this. Ids are currently a little magic, and there's a lot of hoop jumping associated with turning mongo's _id properties into appropriate id properties for elasticsearch / etc.

My inclination right now is that this magic should be stripped out: ids should be treated like any other property, and we should rely on a transformer to move the mongo _id to the elasticsearch id (if that is called for).
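As a rough illustration of treating the id as an ordinary property, the sketch below shows a transformer that renames one field to another. The Doc and Transformer types and the RenameField helper are hypothetical names chosen for this example; they are not transporter's transformer API.

```go
package main

// Doc is a hypothetical stand-in for a document moving through a pipeline.
type Doc map[string]interface{}

// Transformer is a hypothetical hook that maps one document to another.
type Transformer func(Doc) Doc

// RenameField returns a Transformer that moves the value stored under
// from to the key to. Documents without the from field pass through
// unchanged, and the input document is never modified.
func RenameField(from, to string) Transformer {
	return func(in Doc) Doc {
		// Copy first so the caller's document is left untouched.
		out := make(Doc, len(in))
		for k, v := range in {
			out[k] = v
		}
		if v, ok := out[from]; ok {
			out[to] = v
			delete(out, from)
		}
		return out
	}
}
```

With this shape, RenameField("_id", "id") would be added only to pipelines whose sink needs the id under a different name, and pipelines that let the target database generate ids would simply omit it.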

@jipperinbham @mrkurt @codepope thoughts?

jipperinbham commented 9 years ago

I'm in favor of this since documents could still be looked up in ES with a query. Plus if people required the id to be the same, like you said, that's where a transformer can be used.

nstott commented 9 years ago

fixed with #37