doctrine-extensions / DoctrineExtensions

Doctrine2 behavioral extensions, Translatable, Sluggable, Tree-NestedSet, Timestampable, Loggable, Sortable
MIT License
4.03k stars 1.26k forks source link

Add possibility use other hierarchical structures models (differs from NestedSet) #1

Closed Koc closed 12 years ago

Koc commented 13 years ago

There are some other good models for store hierarchical structures, like closure table http://www.slideshare.net/billkarwin/sql-antipatterns-strike-back - slide 68. I think that tree-extension may support differ models.

l3pp4rd commented 13 years ago

Yes that is a good Idea, I will check the possibilities, maybe where will be easier to implement adapter pattern for those diferent Tree types, thanks for noticing

stof commented 13 years ago

I don't know if using adapter would work: NestedSet trees allow to choose the order of the childs which is not the case with the others solution presented there. I agree that other tree models could be usefull. But they should perhaps be implemended as others extensions instead of putting them in the Tree extensions (which should then be renamed to NestedSet).

l3pp4rd commented 13 years ago

Hi, stof, thanks for advice, that ofcourse will be concidered before implementation. If it is not possible to create an adapter in good looking way, when we will concider an implementation of diferent extension.

Koc commented 13 years ago

also there is not completed extension https://github.com/guilhermeblanco/Doctrine2-Hierarchical-Structural-Behavior , maybe it helps you to organize adapters

stof commented 13 years ago

The problem with this extension is that the Node interface contains a set of methods (the last ones) which are only relevant for the NestedSet model. I can't see how they would be implemented for AdjacencyList model for example.

An other problem is that the ClosureTable model requires another entity to store Closure. This will require the user to create this entity for each entity using this model because of the foreign keys. I don't think that this model could be implemented properly by storing all closures in a single table (which is made for translations).

l3pp4rd commented 13 years ago

yes, you`re right stof, all these things can be hard to implement and or maybe not possible in clean way. When I have time, I will check these possibilities of implementation.

stof commented 13 years ago

I think that the implementation of the AdjacencyList model is useless as it is only a OneToMany bidirectionnal association on the entity which is very easy to implement with the ORM. The PathEnumeration model should be easy to implement as an extension but I think it would be much more difficult to implement it as an adapter for the Tree extension. The ClosureTable model is the bigger issue IMHO

Koc commented 13 years ago

http://www.slideshare.net/billkarwin/models-for-hierarchical-data

l3pp4rd commented 13 years ago

closure tables is worth a shot, since in Doctrine2 there can be a clean implementation done. Currently my spare time is very overcrowded, but I think I'll come up with something :)

Koc commented 13 years ago

Hello!

Any news?

l3pp4rd commented 13 years ago

Hi Koc,

I have done some research and experimental implementation. In result I think I will start doing the ClosureTree since it can be used in ODM also. If possible it will be in one package and will have adapters like nested and closure. It is possible that in a month or so it can appear

Koc commented 13 years ago

This is really cool, thank you, will wait for it.

stof commented 13 years ago

What would be the way to use nested in an entity and closure in another with the current way to handle the tree strategy (when the closure will be implemented) ? Registering the TreeListener twice (one for each strategy) or is it handled by a single listener ? My concern is how my bundle configuration should look to configure the tree listener: a boolean or an array of strategies.

Btw, does the tree listener works for MongoDB currently ? It seems that the strategy is missing. Thus the namespace used in the MongoDB tree listener should be Strategy\ODM\MongoDB (just as the listener is in ODM\MondoDB) to allow a further implmentation for couchDB when Doctrine will have it.

l3pp4rd commented 13 years ago

In current architecture, example ORM Tree listener can handle any orm tree strategies otherwise you get an exception on trying to use unsupported strategy in case if it belongs to odm or is not implemented in orm. In your case you will not need to change anything, all strategies will be able to run in single listener and is identified through anotation: @gedmo:Tree(type="nested")

Currently nestedset cannot be used in ODM since it cannot support transactions and wrap few atomic update operations in single executive operation. MaterializedPath method is considered best for document based dbs.

I've started some initial Closure tree implementation on ORM and now considering how to implement it in ODM there are a lot of things to think about here :)

Related to \ODM\MongoDB namespace, you are right, it will be adjusted.

comfortablynumb commented 13 years ago

Hi! first of all, thanks for this great extensions. Nice work, really :)

I'm really interested in your closure table implementation too. I was going to try to implement an extension for this in a few weeks for my app, until I saw these comments. If it's possible, it would be great if you create a branch with your current implementation. I'd like to take a look at it and see if I can help.

And again, thanks a lot! great work :)

l3pp4rd commented 13 years ago

Hi, thanks,

OK, I will create this branch in few days. I've currently made only inserts handling, and was starting the update handling of nodes, but it is more complicated and I need also to check if one atomic update will be enough for that, because in other case it cannot be supported in ODM

comfortablynumb commented 13 years ago

Hi,

I worked on the closure table strategy today. Specifically on the update and remove behaviour. It still needs work and optimizations, so I'd like to know your opinion on what problems could still be with my implementation. Check out the unit tests too.

https://github.com/comfortablynumb/DoctrineExtensions/tree/closure-table

Thanks!

comfortablynumb commented 13 years ago

BTW, what about adding a new field to the Closure Table: level, which would be the real level on the tree. "depth" shows the distance between the ancestor and the descendant. Having a "level" field too would be useful.

And we could add a "childrenCount" field (or something like that) to the entity implementing the Closure Table, to know the number of children it has. I'll analize how much painful is to add this functionalities, but I just wanted to comment this out to know what do you think about this.

Thanks again.

l3pp4rd commented 13 years ago

Currently everything is in experimental stage, I agree to everything what can be useful, it can change fundamentally, for example closure entity can have an id, I think DDC 117 was merged to master and now can support @Id on foreign keys.

l3pp4rd commented 13 years ago

Your solution seems good, I definitely will check it more closely and give you a feedback.

comfortablynumb commented 13 years ago

Hi l3pp4rd,

Did you have time to check this out? I'm already using it on one of my projects and it works good. I have a couple of things I'd like to add to it though.

Let me know what you think about it

Thanks.

l3pp4rd commented 13 years ago

Hi Gustavo,

Yes I have checked the functionality, everything seems to be working great, there are things to improve but that can be done later. We will need to add more functions to repository. But in general annotations and everything seems great. Also I'm thinking on query for descendant removal, maybe later we will find a better way, cause now it takes very long to remove those nodes, I think on large trees moving branch can be very painful. I appreciate your work and waiting for pull request.

Great work, thanks for making extensions better

comfortablynumb commented 13 years ago

Yes, I agree in everything you said. Things definitely needs more work. I have also other things to add that I've found useful while working on one of my projects.

I'll keep working on this. In the meantime, I'll send the PR.

Thanks!

mtotheikle commented 13 years ago

I am currently working on a materialized path implementation for MongoDB. Code can be found at https://github.com/Funsational/DoctrineExtensions/tree/feature/tree-odm-support, will hopefully be making good progress as week goes on.

l3pp4rd commented 13 years ago

@mtotheikle yes I have seen, great, its one thing that is missing mostly in the extensions, tree support for mongo. If you have any questions related to internal functionality or something specific please don't hesitate to ask

l3pp4rd commented 13 years ago

@mtotheikle I see you are doing a lot of interesting stuff, I'll have a look this evening

l3pp4rd commented 13 years ago

when I have free time, I will refactor the closure tree implementation, since it is very slow and have issues with foreign keys which makes it untestable also. Another thing to consider is a relation mapping to closure entity

mtotheikle commented 13 years ago

@l3pp4rd Cool. Let me know what you think. There is still lots of work to be done with it, but it is getting there.

l3pp4rd commented 13 years ago

@mtotheikle it seems cool, but I noticed few things which should be discussed.

These are just advices which would ease further changes and features of the tree.

mtotheikle commented 13 years ago

@l3pp4rd

l3pp4rd commented 13 years ago

Sluggable allows to use different transliterator, which gives an object being slugged as a reference also. This can be used as extension point to load a path for an object and transliterate it. Your transliterator could be injected into sluggable during the flush, and handle the specific object class transliteration differently, other objects could use default behavior still. I understand that path is very cool and everything but it should not be a default behavior. Since id is unique to any object and the path from id could be used using simple urls like: example.com/node/{id} or in a fancy way user defines an annotation pathSource or something to have fancy urls like: example.com/food/berries/strawberries I'm not saying that you should quit making sorting functionality, but just keep in mind that introducing new features to the code which is not stable yet may lead to headache :) even with unit tests

comfortablynumb commented 13 years ago

Hi,

Talking about the closure implementation, I'm trying to do some tests to determine where could be some performance optimization. Which tests did you run to compare both implementations? I tried to use a test which is commented: "testHeavyLoad" on the ClosureTreeTest file, and having xdebug enabled I've got the exception for too much recursion. After doing a trace the problem was that the fixture for the Category entity was using the cascade="persist" option on the "parent" property. Removing this option fixed indeed the exception.

I'm reviewing the code to find out if I can help with the performance issue. If I find something I'll let you know.

l3pp4rd commented 13 years ago

ok, great

mtotheikle commented 13 years ago

An update on the materialized path implementation. Still working on it in my branch but I have come down with a nice cold or something so I have been under the weather a little. I am hoping to continue developing this closing to the end of this week and then start to get some documentation and examples created next week.

l3pp4rd commented 13 years ago

hey, do not rush, you're doing a great job and it does not have any deadlines :) so just get well and do not worry about it.

l3pp4rd commented 13 years ago

So far I'm very disappointed with closure tree performances, its clear that inserting and removing table records is far more slow than atomic updates

# closure tree
1570 - inserts took:    --> 01:41
moving took:        --> 00:28
Time: 02:10, Memory:    103.00Mb

# nested set
1559 - inserts took:    --> 00:25
moving took:        --> 00:00
Time: 26 seconds, Memory: 17.75Mb

Trees were verified in correctness

mtotheikle commented 13 years ago

@l3pp4rd I am guessing that is with the ORM implementation? I would think that Mongo would provide way better results.

l3pp4rd commented 13 years ago

I think that closure table is not a right strategy for NoSQL databases, if you change node parent, closures needs to be updated (some deleted, some inserted). In ORM you can wrap it in a transaction, but in NoSQL there are no such thing to wrap atomic deletes and inserts. In this case you cannot ensure data integrity. Materialized path, can use atomic update operation to sync the paths, thats all it needs to be integral.

markdehaan commented 13 years ago

Yesterday I send the following email to mtotheikle:

At this time I need a materialized path strategy ORM style. So I’d like to implement one for DoctrineExtensions. I noticed your work on DoctrineExtensions for adding a materialized path strategy (ODM style). Your implementation is based on a urlizer, which is fine I guess for ODM (I don’t have any experience with MongoDB), but is not the ideal solution for ORM style in my opinion. A more general / abstract solution like the one used in django-treebeard is probably better (for ORM). With this solution, there is no need for TreePathSource and TreeSort, so one can build a hierarchy even without the need of a urlizable field.

It feels awkward to build another materialized path strategy which is better suited for ORM. I think it’s better to build one strategy that both fits ORM and ODM well. There are two possibilities (maybe more):

- Build a new strategy ‘materialized’ according to the django-treebeard solution. In the long term both materialized and path strategy can implement each others solution and eventually merge into one strategy.

Mtotheikle replied:

Implementation for materialized path in ORM is on the todo list and I agree with that much code should be shared between the two as possible. I have not seen the django-treebeard so I will have to take a look at it. As for how todo this, I am not too sure yet. My plan was to get the ODM part working first with some documentation and then improve upon it and add an ORM implementation.

How the path is setup is defiantly on my list of things to change. That annotation will be optional and if its not by default it would use ID's in the path. The reason it is using paths like it is right now is because then url matching is extremely fast which is why I'm building this implementation right now, for a simple category system. So the path field and how you want to generate it all depend on how you want to query for your nodes IMO.

I am hoping to get some documentation and code clean-up changes done this week and then I will be submitting a pull request. At that time I will be limited to what I can contribute, but if you have any ideas as far as code I would love to see some changes implemented and can help as I am available. I have all the tests passing, except two which are incomplete, so if you want to make changes to make this more universal and reusable in ORM and ODM, that would be awesome. I think this could easily be done and I can start to lay the ground work by implementing a factory like pattern for easily switching between the ORM and ODM but I am not sure how much code will be shared between the two.

Today I noticed this discussion, so I think it's better to shared it here. I currently thinking more towards implementing another strategy "materialized". About 3 years ago I implemented a materialized path behaviour in Ruby with the same principles as django-treebeard and it worked very well.

l3pp4rd commented 13 years ago

It's great to see such discussion, I think it may be easier to implement both materialized path without sharing the logic at first, and later DRY it up by using common logic. Since @mtotheikle has done a lot of work allready, it may be refactored later by him, when he will know which parts to move to shared logic. Thanks for your input, I think the ORM strategy for materialized path is also very useful in some cases. And having a Tree extension which is able to handle different strategies even for single manager is also very awesome

markdehaan commented 13 years ago

I just started working on a ORM strategy for materialized path. I never worked with git. I've always worked with svn, but I think it's easy to pick up. At the moment it's very busy at work so progress will be not too fast I think. Code can be found at https://github.com/markdehaan/DoctrineExtensions/tree/materialized-path

l3pp4rd commented 13 years ago

cool, I think you will enjoy git and soon discover that svn is a nightmare comparing to git :)

Koc commented 13 years ago

Just started writing entities for closure tree in one project (I'm using master branch but not the latest, because haven't updated to sf2b2). And have discovered some questions: 1) Why AbstractClosure::$id is private? I have an error (no PK) when extended this class. @stof marked $id as protected. 2) I've running doctrine:schema:create command and got same sql dump:

CREATE TABLE territorial_area(
  id INT NOT NULL,
  parent_id INT DEFAULT NULL,
  title VARCHAR(255) NOT NULL,
  kind VARCHAR(255) NOT NULL,
  INDEX IDX_9385BB4C727ACA70 (parent_id),
  PRIMARY KEY (id)
) ENGINE = INNODB;
CREATE TABLE territorial_area_closure(
  id INT AUTO_INCREMENT NOT NULL,
  depth INT NOT NULL,
  PRIMARY KEY (id)
) ENGINE = INNODB;
CREATE TABLE territorial_city(
  id INT NOT NULL,
  PRIMARY KEY (id)
) ENGINE = INNODB;
CREATE TABLE territorial_country(
  id INT NOT NULL,
  PRIMARY KEY (id)
) ENGINE = INNODB;
CREATE TABLE territorial_district(
  id INT NOT NULL,
  PRIMARY KEY (id)
) ENGINE = INNODB;

ALTER TABLE territorial_area
        ADD FOREIGN KEY (parent_id) REFERENCES territorial_area (id) ON DELETE CASCADE;
ALTER TABLE territorial_city
        ADD FOREIGN KEY (id) REFERENCES territorial_area (id) ON DELETE CASCADE;
ALTER TABLE territorial_country
        ADD FOREIGN KEY (id) REFERENCES territorial_area (id) ON DELETE CASCADE;
ALTER TABLE territorial_district
        ADD FOREIGN KEY (id) REFERENCES territorial_area (id) ON DELETE CASCADE;

As you see there aren't columns like ancestor and descendant in table territorial_area_closure. Is ot ok?

stof commented 13 years ago

@Koc AFAIK, private properties work well in Doctrine, even in mapped superclass. The only issue is the EntityGenerator which is buggy on this point. This is the only reason why I changed it in the bundle.

l3pp4rd commented 13 years ago

Hi @Koc, first of all, it would better to open a separate issue. Second, have you tried running unit tests, I think they are working well with the private properties and mapping. Third, extensions are still compatible with old annotation reader if new one is not injected. Fourth, are you sure tree listener is triggered on metadata reading, it looks like it is not, if you use APC cache or something like that, clear it. My guess is that treelistener is not attached

stof commented 13 years ago

@l3pp4rd As I said, the only issue is a bug of the EntityGenerator with private properties in the parent class.

@Koc the issue should be opened on the Doctrine issue tracker as it is an issue in Doctrine, not in the extensions.

l3pp4rd commented 13 years ago

By the way, once and for all I will explain why I use private instead of protected. First, protected properties are visible to extended classes which looks like an extension point and because of visibility has to be processed for each child class when reading metadata. So unless you want to intentionally let user modify the mapping of your supper classes, it is unnecessary processing which may introduce complexity and maintainability issues. In some cases I use protected properties if I intentionally want user to be able to increase the string length or something like that.. For closure table its the reason because of unmapped properties, which has to be visible for the listener in order it could map it.

l3pp4rd commented 13 years ago

@stof I hope Doctrine will fix these problems soon, it causes a lot of misunderstanding for users :)

stof commented 13 years ago

@l3pp4rd I guess the EntityGenerator is not a top priority as @beberlei don't like it and would not even have written it if he were lead at this time.

l3pp4rd commented 13 years ago

sadly it is not a matter if someone likes it or not, now it is a maintainability reason, and all those small issues builds an opinion about the product. In fact I have never used it, but new-comers usually see it as a cool feature and wants to try it and in case if it brings errors it may lead to disappointment.