doctrine / mongodb-odm

The Official PHP MongoDB ORM/ODM
https://www.doctrine-project.org/projects/doctrine-mongodb-odm/en/latest/
MIT License
1.09k stars 504 forks

Change the documents collection name #1243

Closed maggo2801 closed 6 years ago

maggo2801 commented 9 years ago

Is there a possibility to change the document's collection name several times within one request?

For example:

Document class: File
Annotated collection name: file

I need to change the collection name several times to get files from different collections, for example: xyz.files, abc.files, files.

malarzm commented 9 years ago

You could try playing with ClassMetadataInfo directly.

maggo2801 commented 9 years ago

I did, but when I changed it a second time (within the same request), the collection used in the query was unchanged. Is there a caching mechanism?

malarzm commented 9 years ago

Not sure. The best thing would be a failing test case, so we can see what you're doing and what's happening behind the scenes (although this may not be something we want to support, since I believe documents could be saved to the wrong collections after such tweaks). On the other hand, you could annotate your File document as a mapped superclass and have three classes extend it, each using a different collection.
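The mapped-superclass suggestion could look roughly like the sketch below. The class names (AbstractFile, GlobalFile, XyzFile, AbcFile) and the fields are hypothetical and chosen only for illustration; the collection names are the ones from the question. Note that this only covers a fixed, known set of collections, not names built at runtime.

```php
<?php

use Doctrine\ODM\MongoDB\Mapping\Annotations as ODM;

/**
 * Shared mapping for all file documents (hypothetical class name).
 *
 * @ODM\MappedSuperclass
 */
abstract class AbstractFile
{
    /** @ODM\Id */
    protected $id;

    /** @ODM\Field(type="string") */
    protected $name;

    public function getName()
    {
        return $this->name;
    }
}

/** @ODM\Document(collection="files") */
class GlobalFile extends AbstractFile
{
}

/** @ODM\Document(collection="xyz.files") */
class XyzFile extends AbstractFile
{
}

/** @ODM\Document(collection="abc.files") */
class AbcFile extends AbstractFile
{
}
```

Each concrete class is then queried through its own repository, so nothing has to change the metadata at runtime.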

maggo2801 commented 9 years ago

This is the code I use to change the collection name.

  $dm = $this->getDocumentManager();
  // Fetch the class metadata once instead of on every access.
  $metadata = $dm->getClassMetadata(File::class);
  if ($this->course !== null) {
    // Prefix the current collection name with the course, e.g. "course.xyz.file".
    $metadata->setCollection('course.' . $this->course . '.' . $metadata->collection);
  } else {
    $collectionName = $metadata->collection;
    // Compare against false explicitly: strpos() returns 0 for a leading dot.
    if (strpos($collectionName, '.') !== false) {
      // Keep only the last segment, i.e. the unprefixed collection name.
      $parts = explode('.', $collectionName);
      $metadata->setCollection(end($parts));
    }
  }

The value of $this->course is not fixed and can differ. I use dot notation to group files for each course. There is also a collection without dots, which holds files for all courses, and I need to fetch both course-specific and global files at the same time.
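The naming scheme described here can be kept in two small pure functions, which also avoids stacking prefixes when the course changes twice in one request. This is only a sketch: both function names are hypothetical, and it assumes the unprefixed collection name itself contains no dots.

```php
<?php

/**
 * Hypothetical helper: returns the last dot-separated segment,
 * i.e. the collection name without any "course.<id>." prefix.
 */
function baseCollectionName(string $collectionName): string
{
    $pos = strrpos($collectionName, '.');

    return $pos === false ? $collectionName : substr($collectionName, $pos + 1);
}

/**
 * Hypothetical helper: builds the collection name for a course,
 * or the global (unprefixed) name when $course is null.
 */
function courseCollectionName(string $collectionName, ?string $course): string
{
    $base = baseCollectionName($collectionName);

    return $course === null ? $base : 'course.' . $course . '.' . $base;
}
```

Because the prefix is always rebuilt from the base name, switching from course xyz to course abc gives course.abc.file rather than course.abc.course.xyz.file.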

malarzm commented 9 years ago

I think you would be better off querying each collection manually using the underlying doctrine/mongodb library and then hydrating the data into objects by calling hydrate manually.
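The shape of that approach, reading several collections and hydrating each raw row yourself, can be sketched without any library. Everything below is an assumption for illustration: findAcrossCollections is a hypothetical name, and the two closures are in-memory stand-ins for the real driver query and the ODM hydrator.

```php
<?php

/**
 * Runs the same lookup against several collections and hydrates every row.
 *
 * @param string[] $collectionNames collections to read, in order
 * @param callable $find            function (string $collectionName): iterable of raw arrays
 * @param callable $hydrate         function (array $row): object
 *
 * @return object[]
 */
function findAcrossCollections(array $collectionNames, callable $find, callable $hydrate): array
{
    $documents = [];
    foreach ($collectionNames as $collectionName) {
        foreach ($find($collectionName) as $row) {
            $documents[] = $hydrate($row);
        }
    }

    return $documents;
}

// In-memory stand-ins for the database and for the hydrator.
$fakeDatabase = [
    'files'            => [['name' => 'syllabus.pdf']],
    'course.xyz.files' => [['name' => 'week1.pdf'], ['name' => 'week2.pdf']],
];
$find = function (string $collectionName) use ($fakeDatabase): array {
    return $fakeDatabase[$collectionName] ?? [];
};
$hydrate = function (array $row): stdClass {
    return (object) $row;
};

// Course-specific files first, then the global ones.
$files = findAcrossCollections(['course.xyz.files', 'files'], $find, $hydrate);
```

In a real application, $find would wrap the driver's query on the named collection and $hydrate would delegate to the ODM's hydration, so the document class metadata never has to be modified.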

maggo2801 commented 9 years ago

Thank you for your help!

I have one last question: how can I convert an entity to an array to persist it with MongoCollection::insert() or MongoCollection::update()?

malarzm commented 9 years ago

The most bulletproof approach is to have your own toArray method on the entity. However, you could try employing PersistenceBuilder: prepareInsertData should work like a charm, but prepareUpdateData may require merging your document into the document manager before changing it, since the method relies on calculated change sets.
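A hand-written toArray method might look like the sketch below. The fields (id, name, size) are hypothetical, and the identifier is a plain string here for simplicity; a real document would normally carry the driver's ID type.

```php
<?php

class File
{
    private $id;
    private $name;
    private $size;

    public function __construct(?string $id, string $name, int $size)
    {
        $this->id = $id;
        $this->name = $name;
        $this->size = $size;
    }

    /**
     * Returns the raw data to pass to an insert or update call.
     * "_id" is left out for new documents so the database can generate it.
     */
    public function toArray(): array
    {
        $data = [
            'name' => $this->name,
            'size' => $this->size,
        ];
        if ($this->id !== null) {
            $data['_id'] = $this->id;
        }

        return $data;
    }
}
```

The trade-off is that the array layout has to be kept in sync with the mapping by hand, which is the price of not relying on the ODM's change sets.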

maggo2801 commented 9 years ago

Is there a possibility of integrating this functionality in the future?

malarzm commented 9 years ago

Personally I'm :-1:, as that would mean a document can belong to more than one collection, which probably means you would have to pass the collection to every find method (complicating not only our internal API but also the public one), plus many more issues I haven't thought of.

Unless you have some neat ideas on how we could support this?

maggo2801 commented 9 years ago

My first idea was to add a parameter named $collectionPrefix to the Query object.

malarzm commented 9 years ago

Retriaged the issue as "idea" :) @alcaeus also seems to have some thoughts on this.

As for $collectionPrefix, we'd also need to store the collection of origin for each document and take that into account during save, but what about inserting documents?

maggo2801 commented 9 years ago

As an optional parameter for merge() and persist(). If the value is null, no prefix is applied and the original collection name is used.

alcaeus commented 9 years ago

In MongoDB schema design it is fairly normal to denormalize your database. This can range from embedding referenced documents to even storing entire documents multiple times in different collections in order to avoid costly lookups. For the second use case, this would be quite interesting, albeit in a different way: a document would live in multiple collections; any write operations would have to be duplicated to all of them while it doesn't matter where the document is read from.

@maggo2801 Out of curiosity, what do you use this for? Do you store the same type of document in different collections depending on type or other criteria?

maggo2801 commented 9 years ago

@alcaeus I use it to group things of the same type for different courses. The first reason is to be able to export/import all information for a course as one bundle. The second is to distinguish between global information for all courses and course-specific information.

alcaeus commented 9 years ago

Thanks for the info. Any particular reason why you split the data into multiple collections instead of using a field to keep different courses apart? Splitting data into multiple collections seems a very special use case, that's why I'm wondering.

senkal commented 8 years ago

@alcaeus @malarzm If the topic is still relevant, I have a potential use case where I went down to the lower MongoClient level and did the hydration on my own. I was storing data per day, because I was also querying it per day. Queries were really fast because the data was balanced nicely between collections (I got a few gigs of data each day). Data housekeeping was also extremely fast and easy, because I could drop old collections at almost no cost instead of removing tons of records at night. With one big collection, my option would have been sharding, which is quite expensive (extra servers/replica sets). With the collection-per-day solution, I could manage ~5 times more data than I did without even considering sharding.

Probably I could still, with more effort, get similar speed with one big collection, but at that time I was quite happy with the results anyway.
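The collection-per-day scheme described above comes down to two operations: deriving the collection name for a day, and finding the collections that have aged out so they can be dropped. The sketch below is an assumption about how that could be done; the function names and the prefix_YYYYMMDD naming are hypothetical, not what was actually used.

```php
<?php

/**
 * Hypothetical naming scheme: "<prefix>_YYYYMMDD", one collection per day.
 */
function dailyCollectionName(string $prefix, DateTimeInterface $day): string
{
    return $prefix . '_' . $day->format('Ymd');
}

/**
 * Returns the daily collections of $prefix that are older than $keepDays.
 * Names that do not match the naming scheme are ignored.
 *
 * @param string[] $existing names of all collections in the database
 *
 * @return string[]
 */
function expiredCollections(array $existing, string $prefix, DateTimeInterface $today, int $keepDays): array
{
    $cutoffDay = (new DateTimeImmutable($today->format('Y-m-d')))->modify('-' . $keepDays . ' days');
    $cutoff = dailyCollectionName($prefix, $cutoffDay);
    $pattern = '/^' . preg_quote($prefix, '/') . '_\d{8}$/';

    $expired = array_filter($existing, function (string $name) use ($pattern, $cutoff): bool {
        // YYYYMMDD sorts chronologically, so a string comparison is enough.
        return preg_match($pattern, $name) === 1 && strcmp($name, $cutoff) < 0;
    });

    return array_values($expired);
}
```

The caller would then drop each returned collection through the driver, which is the cheap housekeeping step mentioned above.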

alcaeus commented 6 years ago

I'll close this for now. While it makes sense to denormalize data (for example, having one main collection containing all items and many more collections containing subsets of that data, as Twitter used to do or still does), it requires more work and is currently not prioritized, as it requires every database operation to be run multiple times.

If your only goal is to split up data into multiple collections, my suggestion would be to not do that: use views instead (even though they can't be used directly) or use proper indexing, sharding, discriminators, etc.

luishdez commented 6 years ago

Hi @alcaeus, I have the same use case.

In my company, we have a CMS/CRM that creates types of items and properties from a UI (since this is managed in the UI by clients, mapping in code is not possible). E.g., you can create a type from the UI: Products, Warehouses, etc.

Currently, we store everything under CollectionItem and we discriminate with a class property type.
So almost every query has a condition equal … type = 'xxx'. Performance is pretty fine with proper index definitions, but we have some problems.

So basically having mixed types in one collection is giving us a lot of management problems.

It would be awesome if we could do something like $qb->overrideCollection('xxx') or similar.

I hope you'll reconsider this feature! It would be great :)

jmikola commented 6 years ago

@luishdez: The problems you've listed above seem like they would be addressed by defining a mapped superclass and using collection-per-class inheritance rather than single-collection inheritance.

luishdez commented 6 years ago

@jmikola Not sure if I'm missing something.

Some time ago we checked the inheritance options and, if I understood correctly, @InheritanceType("COLLECTION_PER_CLASS") will use the name of the class that we use with extends. That implies that we have an existing class in the code, or at least a defined one. But as I said before, those "class" types are generated in a UI by end users, so I'd have to create a virtual class and then map it with Doctrine at runtime. For now, replacing the collection in the metadata seems easier.

Btw, we can't use hooks/events per "query" (e.g. 'collectionPreFind'), since the Collection object that holds the final name of the collection is protected and there is no getter…

FindEventArgs {#2646 ▼
  -invoker: LoggableCollection {#2645 ▼
    #database: LoggableDatabase {#2604 ▶}
    #eventManager: ContainerAwareEventManager {#192 ▶}
    #mongoCollection: MongoCollection {#2623 ▼
      +db: MongoDB {#2619 ▶}
      #name: "Setting"
      #collection: Collection {#2644 ▼
        +"collectionName": "Setting"
        +"databaseName": "crm_321"
        +"manager": Manager {#362}
        +"readConcern": ReadConcern {#2399}
        +"readPreference": ReadPreference {#2642}
        +"typeMap": array:3 [▶]
        +"writeConcern": WriteConcern {#2643}
      }
      #readPreference: ReadPreference {#2642}
      #writeConcern: WriteConcern {#2643}
    }
    #numRetries: 0
    #loggerCallable: array:2 [▶]
  }
  -query: []
  -fields: []
}

Basically, everything within invoker is protected. It would help if we could edit the collection there.

jmikola commented 6 years ago

@luishdez: It looks like I wrongfully assumed that you were using an inheritance strategy due to mention of "discriminate" in:

Currently, we store everything under CollectionItem and we discriminate with a class property type.

That sounded very much like single-collection inheritance at first read. I also missed that the following was allowing for creation of arbitrary class/types:

We have a CMS/CRM that creates types of items and properties from a UI (since this is managed in the UI by clients, mapping in code is not possible). E.g., you can create a type from the UI: Products, Warehouses, etc.

If the UI allows new types/structures to be created on an ad-hoc basis, it's not clear to me how you can reliably index this (even without using ODM). Once a new type is created through the UI, you'd possibly then have to go and create a new index for it.

As @alcaeus mentioned in an earlier post, this seems like a use case for views. I'm not sure if those can work with ODM (possibly with read-only, mapped documents) but it should ease some of the complexity you're facing with Compass and other MongoDB frontends.

luishdez commented 6 years ago

@jmikola yeap sorry that explanation wasn't very clear.

If the UI allows new types/structures to be created on an ad-hoc basis, it's not clear to me how you can reliably index this (even without using ODM). Once a new type is created through the UI, you'd possibly then have to go and create a new index for it.

Well, that's another concern. If an index is required for common queries, it can be created at the moment the "type" is defined within the app UI. That can easily be achieved without problems.

We already use views, but they don't solve all the problems. As with indexes, they have to be created for every type, and the feature is pretty new (since v3.4), so not fully supported… :(

The main problem is the persist moment. We couldn't find an easy way to just change the collection, not even with events (everything is too protected), and editing the main metadata messes with the cache. So it should be done closer to the Wrapper …

Maybe at least we could add setCollection or getCollection methods to open the access? I could try some ideas in a fork and see how it goes.

I know that these cases are not common, but sometimes they are required.

Btw thanks for the help. 😊