djangonauts / django-hstore

PostgreSQL HStore support for Django.
http://django-hstore.readthedocs.io/

Is there a way to reload schema just for one instance? #83

Closed bsod90 closed 9 years ago

bsod90 commented 9 years ago

Hi django-hstore developers!

I'm wondering if there is a way to apply a schema to a single model instance rather than to the whole model class.

My situation is as follows:

I have a model with a set of fields that are common to all instances. This model has a 'category'. Depending on the category (e.g. Hotel, Parking, Theatre) I'd like to add a fixed set of additional attributes (different for every category). For example, Hotel should have a star_rating, while Parking should have a lots_number but no star_rating. I can't create separate models for each category; I'd like it to be as dynamic as possible.

So, the solution I see is to attach a schema name to a Category model and then apply this schema to the 'attributes' field on instance creation. However, if I do it in the way described in the docs:

field = SchemaDataBag._meta.get_field('data')
# load a different schema
field.reload_schema([
    {
        'name': 'url',
        'class': 'URLField'
    }
])
# turn off schema mode
field.reload_schema(None)

the schema gets updated for the whole model class, i.e. for all instances.

nemesifier commented 9 years ago

hi @bsod90,

with the current implementation it is not possible to reload a different schema on an instance level.

bsod90 commented 9 years ago

ok, thanks for the information :)

honi commented 9 years ago

Would it be possible to reload the schema each time just before using the model instance?

@bsod90 I have the exact same use case. Could you share how you worked around this issue?

bsod90 commented 9 years ago

@honi No, that's not the way to go. First of all, you'll have a race condition when two requests want to use the same model with different types. But there are other problems too.
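
To illustrate the race, here is a minimal sketch in plain Python. It does not use django-hstore at all: FakeDictionaryField and Place are hypothetical stand-ins, and the only assumption carried over from the thread is that the field object lives on the model class and is therefore shared by every instance.

```python
class FakeDictionaryField:
    """Hypothetical stand-in for the single field object stored on the model class."""

    def __init__(self):
        self.schema = None

    def reload_schema(self, schema):
        # Replaces the schema for everyone who holds this field object.
        self.schema = schema


class Place:
    # One field object per class, shared by all instances.
    attributes_field = FakeDictionaryField()

    def __init__(self, category):
        self.category = category

    def field_names(self):
        return [f["name"] for f in self.attributes_field.schema]


HOTEL_SCHEMA = [{"name": "star_rating", "class": "IntegerField"}]
PARKING_SCHEMA = [{"name": "lots_number", "class": "IntegerField"}]

# "Request A" prepares a hotel...
hotel = Place("Hotel")
Place.attributes_field.reload_schema(HOTEL_SCHEMA)

# ...and "request B" prepares a parking lot before A is done with its instance.
parking = Place("Parking")
Place.attributes_field.reload_schema(PARKING_SCHEMA)

# The hotel now sees the parking schema, because the field is shared.
print(hotel.field_names())  # ['lots_number']
```

Whichever request reloads last wins, so the other request ends up working with the wrong set of virtual fields.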

In our case we just decided not to use hstore for this and instead defined a number of polymorphic models inheriting from one base. We have ~10 different instance types, so this should work quite well for us.

honi commented 9 years ago

Oh, too bad. It would have been nice to be able to define those properties/fields dynamically instead of having ~10 models.

bsod90 commented 9 years ago

yeah, it would :)

nemesifier commented 9 years ago

I would follow @bsod90's approach. I was just thinking of doing something like that in one of my projects. @bsod90, is your solution open source? Can I see it?

honi commented 9 years ago

What if you still have 1 model for each instance type, but instead of defining the specific fields in each instance type, you simply have one hstore.DictionaryField, which then uses a schema defined in an InstanceType/Category model?

bsod90 commented 9 years ago

@Naddiseo Unfortunately it's private. But it's very straightforward: you just have one Base model with the common fields and several children with the specific fields, plus django-polymorphic (https://django-polymorphic.readthedocs.org/en/latest/) to simplify querying and admin integration. For a small number of types it works quite well. And yeah, for the case where we still have to add something dynamically, we keep an 'attributes' hstore field in the Base model, but use it in schemaless mode.

@honi hmm. looks like a good idea :+1: but you still need django-polymorphic for constructing proper objects for each instance fetched from DB.

honi commented 9 years ago

Here is another idea: you have Product and ProductType models. Each Product has a datasheet field which is a DictionaryField that should be populated with ProductType specific information.

As said previously you can't define a schema in each ProductType and reload it in Product.datasheet each time you use a Product instance.

So instead you can construct one big schema by concatenating the schemas of all ProductTypes. Each time a ProductType is modified (via signals, maybe) you can reload the schema of Product.datasheet.

Here comes the trick. You can add extra fields in the schema (hstore appears to ignore them). In particular, you can add to each field definition a type key. Then, Product model has a method to obtain the datasheet_fields which can be used to configure ModelForms and in template rendering.

Something like this:

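import json

from django.db import models
from django.utils.functional import cached_property
from django_hstore import hstore


class ProductType(models.Model):
    # JSON-encoded list of hstore field definitions for this type
    schema = models.TextField()

    def get_schema(self):
        schema = json.loads(self.schema)
        # tag every field definition with the type that owns it
        for field in schema:
            field['type'] = self.id
        return schema


class Product(models.Model):
    type = models.ForeignKey(ProductType, on_delete=models.CASCADE)
    datasheet = hstore.DictionaryField()

    @cached_property
    def datasheet_fields(self):
        # names of the virtual fields that belong to this product's type
        datasheet = Product._meta.get_field('datasheet')
        return [field['name'] for field in datasheet.schema
                if field.get('type') == self.type_id]


def reload_schema():
    # concatenate the schemas of all types and apply them to the shared field
    schema = []
    for product_type in ProductType.objects.all():
        schema += product_type.get_schema()
    datasheet = Product._meta.get_field('datasheet')
    datasheet.reload_schema(schema)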

Of course each Product would have a lot of unused virtual fields, which could be a problem. In my case I just have a handful of products, so I don't see any issues yet.
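
To make the merged schema concrete, here is a small worked example in plain Python, without Django. The two JSON strings stand in for the schema column of two hypothetical ProductType rows with ids 1 and 2, and the field names are made up for illustration.

```python
import json

# Hypothetical ProductType rows: id -> JSON schema as stored in the TextField.
type_schemas = {
    1: '[{"name": "star_rating", "class": "IntegerField"}]',
    2: '[{"name": "lots_number", "class": "IntegerField"}]',
}


def tagged_schema(type_id, raw):
    """Decode one type's schema and tag each field with the owning type id."""
    schema = json.loads(raw)
    for field in schema:
        field["type"] = type_id
    return schema


# Concatenate all per-type schemas into the one big schema.
combined = []
for type_id, raw in sorted(type_schemas.items()):
    combined += tagged_schema(type_id, raw)


def datasheet_fields(schema, type_id):
    """Return the names of the fields that belong to the given type."""
    return [f["name"] for f in schema if f.get("type") == type_id]


print(datasheet_fields(combined, 1))  # ['star_rating']
print(datasheet_fields(combined, 2))  # ['lots_number']
```

Every product still carries both virtual fields; the type tag only decides which ones are shown in forms and templates.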

I know I may be a bit off topic here, but I'd appreciate any input from you guys (@bsod90, @nemesisdesign), in particular if there is some hstore internal thing I'm overlooking that could get me later on.

bsod90 commented 9 years ago

@honi interesting idea. I see a small overhead in fetching all the types to build the combined schema, but since the number of types is small, it shouldn't be a problem. And yeah, now you have to build all the forms manually. Not a huge problem either, but it was nice when everything integrated with django-admin automatically.

honi commented 9 years ago

Follow up... my solution is still working, so that's good. What I'm still figuring out is when I should reload the schema.

My current implementation is one top-level function called reload_schema which does all the heavy lifting.

I'm calling reload_schema in Product's __init__ method. reload_schema checks whether the product schema is None and only reloads it in that case; otherwise, building a product list would reload the schema for every product, which is overkill.

Then, whenever the type model changes, I also call reload_schema, but with the kwarg force=True, which forces a reload.

This allows me to lazily (re)load the schema whenever it is first needed or whenever it changes. I tried reloading the schema in the request_started signal instead of in Product's __init__, but that broke any code that used a Product object outside a request-response cycle.
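
Roughly, the lazy reload looks like the sketch below. It is plain Python so it runs without Django: FakeDictionaryField and build_combined_schema are hypothetical stand-ins for the real datasheet field and for the loop over all product types, and reload_count exists only to show how often a reload actually happens.

```python
class FakeDictionaryField:
    """Hypothetical stand-in for the shared datasheet field."""

    def __init__(self):
        self.schema = None
        self.reload_count = 0

    def reload_schema(self, schema):
        self.schema = schema
        self.reload_count += 1


datasheet_field = FakeDictionaryField()


def build_combined_schema():
    # Placeholder for concatenating the schemas of all product types.
    return [{"name": "star_rating", "class": "IntegerField", "type": 1}]


def reload_schema(force=False):
    """Reload only if no schema is loaded yet, or if the caller forces it."""
    if datasheet_field.schema is not None and not force:
        return False
    datasheet_field.reload_schema(build_combined_schema())
    return True


# Called from Product.__init__: only the first call does any work.
for _ in range(3):
    reload_schema()

# Called whenever a product type changes.
reload_schema(force=True)

print(datasheet_field.reload_count)  # 2
```

Building a list of products costs one reload at most, and a change to a type costs exactly one more.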

Is there a better place to reload the schema? I'm also wondering what was the initial use case that triggered this feature.

nemesifier commented 9 years ago

Thank you for the insights. I'm closing this to clean up the issue list.