yusefnapora commented 8 years ago

78

So far just adds a very simple Writer.update_artefact method that accepts a canonical ref and a meta dictionary, and submits an artefactUpdate cell. It doesn't try to do any translation or schema validation, etc.

I fixed up a couple things in the reader.api - we were applying the chain updates in reverse order, and we weren't merging dictionaries, so e.g. updating meta.data would overwrite any existing value there.

I'm not yet sure what the best way to expose the update command is... having this freeform method is fine (once we run it through the schema validation), but we should also support updating an existing record with new translations of external data, which is a bit trickier.

Open to suggestions on that :)

parkan commented 8 years ago

Yeah, the case that I found really interesting is reprocessing an existing metadata source w/new translator -- there's probably a neat way of doing addressing for that w/IPLD

vyzo commented 8 years ago

I think we want to have some sorts of validation here, perhaps with a user data schema?

parkan commented 8 years ago

@yusefnapora I'll update docs for this

parkan commented 8 years ago

@vyzo right now the validation happens in the translator and (given the generally permissive nature of python) I think that's an OK place for it -- it will run if you do an update based on an external source, and you could have always bypassed it easily anyway.

vyzo commented 8 years ago

fair enough.

parkan commented 8 years ago

Noting down new CLI commands (to come) from slack w/yusef:

update Qm00.. translatorname@Qm11... single_file.someformat: translate metadata from single_file.someformat and apply it to the chain of Qm00... canonical, possibly creating "satellite" canonicals (basically like ingest but with single file input and no "central" canonical creation)
update --direct Qm00.. translatorname@Qm11... < something.json: update Qm00... chain with contents of something.json directly (json needs to conform to our metaschema) +
reprocess Qm00... translatorname@Qm22...: reprocess metadata in raw_ref using a different translator and update the chain with the new translation -- need to figure out how to specify which chain cell to reprocess, and potentially add an "unfolded" results view in the client so you can find the right ref

Probably going to punt on the last one in this release, since it does involve some complexity

parkan commented 8 years ago

lgtm, good to merge?

yusefnapora commented 8 years ago

Yeah, I think we can merge it. the update-with-translator step is kind of inefficient, since it will re-add the artefactCreatedBy cell (and any other cells the translator spits out). That ends up being essentially a no-op when you fold over them if they've got the same content, but it adds a bit of overhead since you have to fetch the extra cell, and we have to store it, etc.

But avoiding that would be quite annoying; you'd have to first fetch the existing chain and compare the new cells to the existing ones to weed out duplicates. I'm fine with punting on that for the moment.

yusefnapora commented 8 years ago

actually, now that I think of it, in real-world usage, you'd probably never have 100% identical cells, because if you're retranslating (or using a completely different translator), the translator id would be different, even if they end up producing the same metadata. So I'm fine with it after all 😄

parkan commented 8 years ago

Yeah, I can live with that for now

yusefnapora commented 8 years ago

oh one more thing before I merge; I need to add the --skip-image-downloads flag to the update command parsers.

yusefnapora commented 8 years ago

oh, hmm. actually it doesn't make sense for update-direct; that one doesn't try to download images at all... still experiencing some coffee-brain latency this morning

mediachain / oldchain-client

wip - update API #81

78