Open dantleech opened 10 years ago
can php array keys be anything else than strings? if not, that should actually be doable. i think we would need to find a better name, as we already have multivalue with assoc=true which is used for flat key-value hashmaps. something like NestedArray?
but anyways, lets not start this before the cmf 1.1 is released.
php array keys can be integers and strings, nothing else
I would suggest an handling equal to the type setting in the JMSSerialzerBundle. It should be important to keep strings for the key to set them as localnames by default.
If the phpcr-odm can handle this type setting it should be possible to persist complete document-structures in an equal way We serialize objects to XML for example. The only difference should be: the structures have to contain Nodes as Wrapper for other properties.
I think this is a little bit more than the issue started, but goes in the same way. Cause if We start to add deeper array there can be a way of setting deeper documents instead of reference them as children. This can be a Second way.
Oh i wrote nonesense. But can not delete it at the momentfrom my mobile phone.
Thought he wants to set the deeper arrays as real child documents.
On which level would this feature be implemented? Only in the ODM, or would this already be placed in the PHPCR-Specification?
Only odm, the phpcr spec is linked to jcr. And its not a basic thing, its just syntactic sugar to automatically store data without objects.
----- Reply message ----- Von: "Daniel Rotter" notifications@github.com An: "doctrine/phpcr-odm" phpcr-odm@noreply.github.com Cc: "David Buchmann" david@liip.ch Betreff: [phpcr-odm] Support nested associative arrays (#417) Datum: Di., Feb. 11, 2014 08:05 On which level would this feature be implemented? Only in the ODM, or would this already be placed in the PHPCR-Specification?
— Reply to this email directly or view it on GitHub.
Could we not do this as a BC break? As it would otherwise overlap with the functionality of assoc=true
we could add another thing like assoc, calling it nested
and then it would persist differently, but its still an attribute field and not a children mapping in phpcr-odm terms.
Not sure mapping it as an attribute field would make sense .. i.e. @Integer(nested=true)
, @String(nested=true)
. Would we then cast all the elements as integer
or string
depending on the field type?
I think a dedicated field type would make more sense, (like Doctrines @Field(type=phparray)
)
We could then either unconditionally map all the property values as string
or dynamically cast them.
yeah you are right, a field like @Array resp @Field(type=array) sounds good. afaik we call the other thing multivalue, and multivalue is just that, while array maps to phpcr nodes.
Is it possible to get an @Field(type=<integer,string>) to get valdation on keys and values for their types? Sounds cool for Annotation, but imposible for xml config. By this it should be easy to go one Level deeper and so on.
Oh my signs are gone. Ment @Field(type=<integer,string>)
if we don't specify it, the types will be autodetermined by phpcr and come back in the right type on loading (string / integer / float / boolean)
Recently I had some issues with that when there were null values in an array (using assoc=true) forcing me to remove all null velues before persisting.
uh, i think indeed in phpcr setting something to null means removing it. so your problem was that the keys get shifted because values disapear when reloading the property? maybe we should fix persisting assoc arrays to clean null values before storing the keys and values. otherwise we would need to store the indexes that where null in a 3rd field and on restoring the array put them back in the right places. that sounds tricky, but would probably the least wtf. do you want to open a separate issue for this? can we address this in like the next couple of days to get it into the 1.1 release? there is potential BC break here, if people somehow worked around the bogus behaviour.
Does JCR store property type in the property (i.e. so something that is dropped in PHPCR) or is it the responsability of the nodeType definition?
Having metadata on the properties sounds awkward indeed.
@dantleech the type of a property is stored in the property itself (the type is just there to enforce what type a property may have). i think uwe was talking about something different - null values are simply not stored in phpcr. assuming the 'test' property exists, $node->setPropertyValue('test', null) is the same as $node->getProperty('test')->remove() . this seems to hold true for arrays as well (though only in jackalope-jackrabbit probably.)
@dbu so the type is stored with the property? Looking at a PHPCR dump I don't see any data other than the value on the property, I thought it was dynamically inferred.
If the type is stored with the property, can we not add a custom type for null
?
@dantleech look at an export in the system view, you will see the type. the types of properties are a limited set, there are no custom types and there is no type for null - the doc is quite explicit on the semantics of null: http://www.day.com/specs/jcr/2.0/10_Writing.html#10.4.2.4%20No%20Null%20Values
@dbu: I can't remember from heart but I think Jackalope or phpcr-utils throw an exception somewhere when you try to store an array with assoc=true and null values.
i can't store null values. what i get is
[PHPCR\ValueFormatException]
Can not determine type of property with value "NULL"
not an extremly helpful message, but at least it does not work to create messed up data.
Shall we do this for 1.1? I might be able to look at it at the weekend if nobody else is interested.
you mean the validation? that would be nice. for the original topic of nested associative arrays, i am -1 as this is a totally new thing with quite some effort involved, and we really need to get 1.1 out and the cmf bundles 1.1 out. we are behind the roadmap by almost 2 months...
This would also need some special handling for null values in the array, as properties with null values are not stored by phpcr, so you will loose the keys!
see #457 for how uwe solved this for the hashmaps case - the solution here would probably look similar.
Hmm. Just had a look at this and one issue is how or if we should handle translations here?
i would handle them for the whole property, as with non-nested arrays. if the property is translated, the whole thing is translated. if you use children translation strategy, there is no problem. for attribute strategy, if the idea is to create subnodes, we should use the same naming schema as for properties for the top level node of the structure.
ok, i thought the same but it struck me as being maybe non-trivial - would this open a box of refactorings do you think?
maybe the assumptions of the parameter to a translation strategy needs to be altered a bit - maybe not even that, not sure. apart from that i expect it to be rather straightforward.
I just wrote a an event listener class for SuluCMF which does this. It uses a dot-delimited property name to store multi-dimensional arrays.
Name | Type | Vaue |
---|---|---|
foo.bar.bar | STRING | Value 1 |
foo.boo.baz | LONG | 1234 |
foo.car | STRING | Value 2 |
Would give
array(
'foo' => array(
'bar' => => array(
'bar' => 'Value 1',
'baz' => 1234,
),
'car' => 'Value 2',
),
Will try and port it here.
so you use a child node for the array, but only one instead of a nested node tree? i guess that works, but why not go with nested nodes while you are at it? this scheme would break if i do
array(
'foo.bar' => 'x',
);
No everything is in the same node in this strategy.
Regarding the breaking of the scheme, if we rewrote the entire array every time it shouldn't be a problem I think.
if everything is in the same node, does that mean there can only be one array? and how do you know what belongs into the array or not?
with rewriting, you mean encode .
in hashmap keys with something else?
No, say you have a property foo
which is mapped as an array
in PHPCR-ODM.
If you set its value to array('foo' => array('bar' => 'baz'), 'car' => array('boat', 'plane'))
then the following properties would be created in the node:
foo.foo.bar STRING "baz"
foo.car.0 STRING "boat"
foo.car.1 STRING "plane"
We would know it is an array because there would be a mapping indicating that it is so.
with rewriting, you mean encode . in hashmap keys with something else?
Ah, yeah I see what you mean. We would need to encode/escape the delimiter.
ah ok. i think your first example was missing the property name, that got me confused ;-)
and then to rebuild the array, you would scan the node for all properties starting with "foo.". but be aware that properties do not have guaranteed order, only nodes can have that. this means we can not guarantee to rebuild the exact same php array. but this is no argument against your architecture, as the same problem holds for nested nodes for child arrays.
the order would only be guaranteed if we would use a single multivalue field with a list of keys and a second one with a list of values. but then it would become unsearchable again. we might use a separate multivalue property just for the order.
Hmm yeah. But is order really important with associative arrays?
In the case of non-associative arrays they would have indexes so would be orderable. I think unordered associative arrays is an acceptable compromise (although if pushed we could store the order as you say).
well, i could do a foreach on the assoc array... i think while we are at it we should make it clean, or we risk unexpected things. the separate multivalue field for the order should be ok. you could then restore the array by doing foreach on that multivalue field and get the values out of the property values array of the node.
maybe we could make it an option?
/**
* @PHPCR\Array(ordered=true)
*/
protected $options;
Storing the order would increase the node size, and this strategy is already pretty inefficient in terms of size -- although perhaps this is a moot point anyway as we need to store the keys in order to have the null
values.
Also .. this could replace the existing array implementation(s) what do you think?
(regarding the null
values we could also represent them as a string e.g. __null__
and make sure that that value is escaped when the user uses it) /cc @uwej711
i would not make keeping the order optional - keep complexity manageable. if anything, we could support either expanding the way you do, or serializing an array (if you are concerned about efficiency).
and yes, this should probably replace what uwe did, no need to have two slightly different things for the same. this however is a BC break for existing data...
can you search the same way as with the uwe-solution? right now if i had a field $f, it was f = 'foo', now it would be f.* = 'foo' right?
I guess for searching we should make it transparent (and therefore maintain BC). I guess we need to do some research about the best way to implement the search syntax/API. But I think the basic case f = 'foo'
should be the same.
in the query builder, yes (though, you could also want to search for a specific key?)
i was thinking about SQL2 queries too, there must be a representation that works.
well, I guess SQL2 is to PHPCR-ODM what SQL is to Doctrine ORM.. i.e. it breaks the abstraction so you shouldn't really use it to query for documents.
Had a look at this. Basically there is a pain point with the translation strategies because the they would need to set a property value for each element in the array instead of a single value. So we would probably have to push more responsiblity into those (although I guess it could be handled in an abstract class and we would have a "serialization" helper).
for child strategy, we need not do anything at all, as each translation has its own node. for attribute, we should probably just translate each property, if you go with the approach of splitting each array entry into its own property.
It should be possible to have an array type which automatically creates a structure, e.g
Then automatically map that to a PHPCR structure: