jbenet / transformer

transformer - multiformat data conversion
transform.datadex.io

Finding/selecting conversions #10

Closed jbenet closed 10 years ago

jbenet commented 10 years ago

@maxogden pointed out that relying on deterministic naming as the only way to use a conversion is not very friendly to community contributions. This is particularly bad when more than one implementation of the same conversion exists. It should be possible to:

a) add new, different implementations of conversions
b) let others use these conversions
c) not disrupt the current users of existing conversions

So, the naming convention (transformer-<type-name> and transformer-<type-name>-to-<type-name>) is good, but should not be the only way to do it.

This is complicated further by the desired api:

var transformer = require('transformer');
var a2z = transformer('my-format-a', 'some-format-z');

or in the cli:

cat data-file-a | transform my-format-a some-format-z > data-file-z

Say converting from my-format-a to some-format-z requires composing multiple intermediate conversions (a -> b -> ... -> z). If every conversion i -> j can have multiple implementations, selecting the right implementation for every step along the way can become difficult.

For now, it can be fine to have to construct the composition explicitly:

var a2z = transformer('my-format-a', 'another-format-b', 'c', ..., 'some-format-z');
transform my-format-a another-format-b c ... some-format-z
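
As a rough illustration of explicit composition, here is a minimal sketch. It assumes each conversion is a plain synchronous function registered under a `<from>-to-<to>` id. The `conversions` registry and the `compose` helper are hypothetical names made up for this example, not transformer's actual API.

```javascript
// Hypothetical registry: conversion id -> function (not the real transformer API).
const conversions = {
  'ip-address-to-buffer': (ip) => Buffer.from(ip.split('.').map(Number)),
  'buffer-to-hex': (buf) => buf.toString('hex'),
};

// Build one function from an explicit list of types, looking up each
// adjacent pair of types as '<from>-to-<to>'.
function compose(...types) {
  const steps = [];
  for (let i = 0; i + 1 < types.length; i++) {
    const id = `${types[i]}-to-${types[i + 1]}`;
    const fn = conversions[id];
    if (!fn) throw new Error(`no conversion registered for ${id}`);
    steps.push(fn);
  }
  // Feed the output of each step into the next one.
  return (input) => steps.reduce((value, step) => step(value), input);
}

const ip2hex = compose('ip-address', 'buffer', 'hex');
console.log(ip2hex('127.0.0.1')); // prints "7f000001"
```

Because the caller names every intermediate type, there is nothing to search for; the only remaining ambiguity is which implementation backs each `<from>-to-<to>` step.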

But eventually we'd like to be able to automatically suggest + select the conversions. At some point, something like this will be possible:

> transform --type-paths ip-address hex
ip-address buffer hex
ip-address buffer byte-stream hex-stream hex
... // listing of types

> transform --full-paths ip-address hex
ip-address ip-address-to-buffer buffer buffer-to-hex hex
ip-address ip-address-to-buffer2 buffer buffer-to-hex hex
...

(Here ip-address-to-buffer2 is an alternative implementation of the ip-address -> buffer conversion.) And then we could find good ways of selecting conversion "routes".
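
To make the path listings above concrete, here is a hedged sketch of how they could be enumerated with a depth-first search over a catalogue of conversions. The `catalogue` array and the `fullPaths` / `typePaths` functions are assumptions for illustration only; nothing here describes how transformer actually discovers modules.

```javascript
// Hypothetical catalogue of published conversions (illustrative only).
const catalogue = [
  { id: 'ip-address-to-buffer', from: 'ip-address', to: 'buffer' },
  { id: 'ip-address-to-buffer2', from: 'ip-address', to: 'buffer' },
  { id: 'buffer-to-hex', from: 'buffer', to: 'hex' },
];

// Depth-first search over types. Each result alternates
// type, conversion id, type, ... and never revisits a type.
function fullPaths(from, to, visited = new Set([from])) {
  if (from === to) return [[to]];
  const paths = [];
  for (const conv of catalogue) {
    if (conv.from !== from || visited.has(conv.to)) continue;
    const rest = fullPaths(conv.to, to, new Set(visited).add(conv.to));
    for (const tail of rest) paths.push([from, conv.id, ...tail]);
  }
  return paths;
}

// Type paths keep only the types (even indices) and drop duplicates
// that differ solely in which implementation was chosen.
function typePaths(from, to) {
  const seen = new Set();
  for (const p of fullPaths(from, to)) {
    seen.add(p.filter((_, i) => i % 2 === 0).join(' '));
  }
  return [...seen];
}

for (const p of fullPaths('ip-address', 'hex')) console.log(p.join(' '));
```

With this catalogue, `typePaths` collapses the two full paths into a single type path, which shows how the number of routes grows with the number of implementations per step rather than with the number of types.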

jbenet commented 10 years ago

All this could get out of hand with lots and lots of conversion implementations. Ideally you'd have one conversion between any type pair and be done. But it is reasonable to assume that implementations will not be perfect, so we will need to allow other implementations to be published and used.

Maybe this is where namespacing can be useful. Rather than people publishing packages under ad hoc names and confusing users (is foo-to-bar-fast a second implementation of "foo -> bar", or the first implementation of "foo -> bar-fast"?), we could set up a culture of publishing namespaced names.

This way, it is clear what the conversion is. So maybe the npm naming really should be:

transformer.<id>[.<optional namespace/tag>]

And if a clearly better conversion emerges, we can repurpose the un-namespaced package: transformer.<id> would simply re-export it with module.exports = require('transformer.<id>.other-impl').
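
A small sketch of how the `transformer.<id>[.<optional namespace/tag>]` scheme separates the conversion id from the implementation tag. It assumes that ids and tags never contain a dot; the `parseName` helper and the `fast` tag are hypothetical and only serve the example.

```javascript
// Split a package name of the form transformer.<id>[.<tag>].
// Assumes neither <id> nor <tag> contains a '.' (an assumption of this sketch).
function parseName(name) {
  const parts = name.split('.');
  const wellFormed =
    parts[0] === 'transformer' &&
    (parts.length === 2 || parts.length === 3) &&
    parts.every((p) => p.length > 0);
  if (!wellFormed) return null;
  const [, id, tag = null] = parts;
  return { id, tag };
}

// The dot makes the two readings of "foo-to-bar-fast" distinguishable:
console.log(parseName('transformer.foo-to-bar.fast')); // implementation "fast" of foo-to-bar
console.log(parseName('transformer.foo-to-bar-fast')); // default implementation of foo-to-bar-fast
```

Under this reading, a name without a tag refers to the default implementation, which is the package that could later be repurposed to re-export a better one.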

jbenet commented 10 years ago

We seem to have settled on this approach. Can revisit later if it needs to change.