dkubb / axiom

Simplifies querying of structured data using relational algebra
https://github.com/dkubb/axiom
MIT License

Data typing #49

Closed valikos closed 10 years ago

valikos commented 10 years ago

Can Axiom::Relation automatically convert input data according to the attribute map? For example:

header = [[:id, Integer], [:name, String]]
relation1 = Axiom::Relation.new(header, [[ 1, 'Some string' ]])
# {#<Axiom::Tuple data={#<Axiom::Attribute::Integer name=:id type=Axiom::Types::Integer required?=true>=>1, #<Axiom::Attribute::String name=:name type=Axiom::Types::String required?=true>=>"Some string"}
relation2 = Axiom::Relation.new(header, [[ '2', :kukuruku ]])
# {#<Axiom::Tuple data={#<Axiom::Attribute::Integer name=:id type=Axiom::Types::Integer required?=true>=>"2", #<Axiom::Attribute::String name=:name type=Axiom::Types::String required?=true>=>:kukuruku}

As we can see, in the second case the relation contains data of the wrong types: the id is a String and the name is a Symbol.

dkubb commented 10 years ago

@valikos at the moment axiom can't coerce data. It probably will never coerce data automagically. The general assumption is that the data it receives is in the correct type and weird things could happen if there is a mismatch.

However, it might eventually support explicit coercion from one type to another. This may even be necessary in some cases where you want to join two relations, where one has numeric strings and the other has integers. You'd coerce the strings into integers and then be able to join it against the other relation.
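To illustrate the coerce-then-join idea, here is a minimal plain-Ruby sketch. It does not use any Axiom API: tuples are modelled as hashes, and `coerce_attribute` and `join_on` are hypothetical helpers written only for this example.

```ruby
# Plain-Ruby sketch; no Axiom API is used here.
# Tuples are modelled as hashes and both helper names are hypothetical.

# Coerce one attribute of every tuple with the given callable.
def coerce_attribute(tuples, name, coercer)
  tuples.map { |tuple| tuple.merge(name => coercer.call(tuple[name])) }
end

# Join two lists of tuples on a single shared attribute.
def join_on(left, right, name)
  left.flat_map do |l|
    right.select { |r| r[name] == l[name] }.map { |r| l.merge(r) }
  end
end

users  = [{ id: 1, name: 'Alice' }, { id: 2, name: 'Bob' }]
orders = [{ id: '2', total: 10 }, { id: '1', total: 25 }]

# Without coercion nothing matches, because '2' != 2.
uncoerced = join_on(users, orders, :id)

# Integer() raises ArgumentError on non-numeric strings instead of guessing.
coerced_orders = coerce_attribute(orders, :id, ->(value) { Integer(value) })
joined = join_on(users, coerced_orders, :id)
```

Here `uncoerced` is empty, while `joined` pairs each user with the matching order once the ids share a type.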

valikos commented 10 years ago

Thanks a lot for the answer, this is very helpful. I just need to validate the data I receive, and then I think everything will be fine.

snusnu commented 10 years ago

@valikos for turning the data into something consumable by axiom, you could consider using ducktrap. It is ideally suited for this task.

dkubb commented 10 years ago

@valikos yeah, there's also https://github.com/solnic/coercible to handle many common kinds of coercions.

@snusnu OT, but has there been any discussion about having ducktrap delegate some of its coercion logic to coercible?

snusnu commented 10 years ago

@dkubb iirc yes, solnic and @mbj were talking about this at some point

valikos commented 10 years ago

@dkubb I know about https://github.com/solnic/coercible, it is a good project and a good solution. But I was hoping that axiom would do it by default. Anyway, there is always a solution. Thanks.

valikos commented 10 years ago

@snusnu I looked at ducktrap. Thanks for the tip.

blambeau commented 10 years ago

@dkubb @valikos I think that a solution to this might be to support a special coerce relational operator (under the assumption that another library such as coercible is available to do the actual coercion job). This is something I tried in Alf, and it works quite nicely.

coerce applies to a relation and takes a new coercion heading with a subset of attribute names. Coercions apply in the obvious way. In the example at hand:

source_header = [[:id, Integer], [:name, Symbol]]
coercion_header = [[:name, String]]
Axiom::Relation.new(source_header, [[ '2', :kukuruku ]]).coerce(coercion_header)

The only (serious) problem I see is that theoretically, you don't always know whether keys are preserved. To be sound, you have to eliminate duplicates (just in case). You can only avoid this provided at least one candidate key is not coerced.
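A small plain-Ruby sketch of that problem, assuming tuples are hashes and `:code` is the only candidate key (the data and the attribute name are made up for illustration, and nothing here is Axiom or Alf API):

```ruby
# Three distinct tuples; :code is the only candidate key.
tuples = [{ code: '7' }, { code: '07' }, { code: '8' }]

# Coerce String -> Integer in base 10. '7' and '07' both become 7,
# so the coercion is not an injection on this data.
coerced = tuples.map { |tuple| tuple.merge(code: Integer(tuple[:code], 10)) }

# To stay a proper relation (a set of tuples), duplicates must be removed.
deduplicated = coerced.uniq
```

After coercion the list holds two equal tuples, and `uniq` reduces the three inputs to two. If a second, uncoerced candidate key had been present, the tuples would still differ on it and the deduplication step could be skipped.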

valikos commented 10 years ago

@blambeau axiom does not allow a relation with a Symbol type

header = [[:name, Symbol]]
Axiom::Relation.new header
# => NoMethodError: undefined method `new' for Symbol:Class

And I think that your example does not work; I do not get any result.

blambeau commented 10 years ago

@valikos Hmm, sure, it was a theoretical proposal for a new feature. Not something that already exists in Axiom.

For the Symbol stuff, I did not know that, sorry. It makes your original request rather suspicious IMHO, because it would require Axiom to work on data types that are not known in the first place...

Axiom::Relation.new(header, [[ '2', :kukuruku ]])

valikos commented 10 years ago

@blambeau anyway, this is only an example. I am sure that in a real application the received data will not be of type Symbol. I only used it to check how axiom works with received data. My goal was to find out whether axiom automatically converts received data according to the header.

dkubb commented 10 years ago

FWIW I would accept a PR that adds a Symbol attribute to axiom. There's already a Symbol type in axiom-types, so that could be used.

Also, for coercion I thought about first adding functions that could be used in Relation#extend to allow adding new attributes that are coerced to a new type. I could then build a Relation#coerce method that makes it a bit more user friendly. I'm not sure if I would want to automatically project the coerced attributes, or if it would work like #rename where only the attributes specified are coerced and everything else stays as-is, e.g.:

# Assumes the Symbol attribute from @blambeau's example

relation = Axiom::Relation.new([[:id, Integer], [:name, Symbol]], [['2', :kukuruku]])
coerced = relation.coerce(name: String)

blambeau commented 10 years ago

@dkubb Agreed, the Hash syntax is certainly more friendly. I did not actually suggest projecting unspecified attributes away (I'm not sure that would be either useful or intuitive).

What I wanted to point out is that unless you have a mechanism to know whether a coercion is actually an injection (preserves distinctiveness), the coerce operator tends to kill candidate keys. I'm not sure it's a real problem in practice, but that suggests a few new requirements/tools in the coercion libraries themselves for keeping track of that kind of information (Symbol -> String in ruby is an injection, for instance, so in the example above the name candidate key is preserved, if any).
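As a rough illustration, a coercion can at least be checked for injectivity against the values actually present. This is a plain-Ruby sketch, and `injective_on?` is a hypothetical helper, not part of coercible, Axiom or Alf. It only checks the data at hand; it does not prove that a coercion is an injection in general, which is the kind of information the coercion library would have to declare itself.

```ruby
# Hypothetical helper: true when the coercer maps the distinct input
# values to equally many distinct outputs (injective on this data only).
def injective_on?(values, coercer)
  distinct = values.uniq
  distinct.map { |value| coercer.call(value) }.uniq.size == distinct.size
end

# Symbol -> String keeps distinct names distinct.
names_preserved = injective_on?(%i[kukuruku foo bar], :to_s.to_proc)

# String -> Integer can merge values such as '7' and '07'.
codes_preserved = injective_on?(%w[7 07 8], ->(value) { Integer(value, 10) })
```

Here `names_preserved` is true and `codes_preserved` is false, so only the first coercion would let a candidate key on that attribute survive without a deduplication pass.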