mozilla / elasticutils

[deprecated] A friendly chainable ElasticSearch interface for python
http://elasticutils.rtfd.org
BSD 3-Clause "New" or "Revised" License
243 stars 76 forks source link

[WIP] Added support for declarative document types. #182

Open robhudson opened 10 years ago

robhudson commented 10 years ago

This isn't ready for merge at all but it shows a direction and code. Feedback wanted.

willkg commented 10 years ago

I haven't had a chance to look at this, yet. Maybe this month, but I'm not sure.

robhudson commented 10 years ago

I haven't had a chance to look at this, yet. Maybe this month, but I'm not sure.

Yep, no worries. It's still a ways off from being usable. I was considering setting up a wiki page to document the direction with examples, both for me to flesh it out, and for others to comment on the direction and ideas.

robhudson commented 10 years ago

I documented some thoughts on where this is going on the wiki on the fork of my project: https://github.com/robhudson/elasticutils/wiki/Declarative-mapping

robhudson commented 10 years ago

Yesterday while watching college football I updated this to support all field options documented and added lots of tests which brings the fields.py file up to 99% test coverage. I've kept all the commits separate for now but it might be more helpful to review the patch as a whole here on Github.

Next steps for me would be to support the object field type. I'm going to experiment with the idea of treating it similar to Django's relational database model, as mentioned in the wiki page (mentioned earlier).

Once that is done what might be left to get this merged? What else would you like to see here?

I'm also curious how this may play into the future plans of elasticutils. Any direction there will be helpful.

willkg commented 10 years ago

I'm really sorry it's taken me so long to get around to looking at this.

This looks good so far. My only thought on this right now is that it's not future-proofed. One thing I liked about pyelasticsearch is that it went out of its way to be future-proofed so you could always use things that Elasticsearch had just released even if pyelasticsearch didn't natively support them, yet. Here the fields and attributes are all hard-coded and there doesn't seem to be a way to allow for things we don't have support for without creating your own Field class.

I don't know offhand how often mapping type attributes change. This might be fine, but it might be a cause of pain especially if Elasticsearch adds new things on a regular basis.

I don't know if I want to do anything about this right now, but it's definitely something I think is worth thinking about and at a minimum, we should document how we expect users deal with this issue.

Another things I'm not sure what to do with is whether all of these attributes are supported for the very wide spectrum of Elasticsearch releases we're supporting. First off, we're probably doing a lousy job of supporting them all. But let's assume we're doing a good job. How does a user looking at the EU docs know what attributes are supported by the version of Elasticsearch they're using? Do we have to document everything (which is difficult because of the way they do their documentation)? Is it sufficient to document the errors the user will get when using something that's not supported by their Elasticsearch instance?

Going forward, I think we could/should scuttle MappingType and replace it with this. That probably has a bunch of architectural ramifications I'm not thinking of right now.

I think this is an important part of the next round of ElasticUtils. I've been tossing around starting a rewrite that is not backwards-compatible. If I did such a thing, we could include this work wholesale since it's pretty distinct from the rest of ElasticUtils. I don't think there's any danger in continuing work on this. I don't mean to be keeping my cards close to my chest--it's more that I haven't really been thinking much about ElasticUtils in the last few months. However, I do agree with @jezdez and others that the current state of affairs sucks and we could do better.

willkg commented 10 years ago

Talked to Rob and we (or I depending on whether I misunderstand or not) want to do the following:

  1. nix the casting code because it's interesting but the use case probably isn't compelling enough to warrant the additional code and maintenance
  2. add an open-ended kwargs for Fields so that even if EU doesn't know about that argument, we'll just pass it to ES and thus be future-proofed

After that, I think we do this:

  1. spruce up the docs
  2. tie this code into the search results code
  3. nix the MappingType class
  4. rework Indexable so that it works with DocumentType
  5. LAND IT! THROW A PARTY! MAKE T-SHIRTS! RETIRE!
willkg commented 10 years ago

Oh, and step 6 is write up issues for all the things we didn't implement in this iteration: nested something or others, object something or others, etc etc etc.

robhudson commented 10 years ago

@willkg Is this commit what you meant by open-ended kwargs? https://github.com/robhudson/elasticutils/commit/87257e6. This moves to us adding anything passed into kwargs to the mapping (minus the couple that are special).

robhudson commented 10 years ago

Updated this today based on feedback. What's left is:

I re-ordered this in the order I plan to work on it.

willkg commented 10 years ago

The "more fun" part of me wonders if we shouldn't just LAND IT! THROW A PARTY! MAKE T-SHIRTS! RETIRE! first and do the rest later.

willkg commented 10 years ago

I still want to think about this and do something, but I'm reducing the 0.10 milestone to the minimum, so I'm bumping this to the future.