linkedin / pyexchange

Python wrapper for Microsoft Exchange
Apache License 2.0
153 stars 93 forks source link

E-Mail Support? #44

Open fizxmike opened 9 years ago

fizxmike commented 9 years ago

Our institution disables SMTP functionality on our e-mail accounts, forcing us to use their web client or some other native Exchange client (mobile or otherwise). We want our web apps to send out automated e-mails through our institutions exchange system as local e-mails are secure.

So, is there any plan to add E-mail/Inbox functionality to pyexchange? Also, how much effort do you estimate is required if I were to want to extend your code to do this?

catermelon commented 9 years ago

Hey there,

I have no plans to add in support for email (the current state of things works for us), but I'd be more than happy to get pull requests on it!

For difficulty, I would say it's not hard, but it's a little tricky. The library takes the approach of using raw XML instead of SOAP (since python soap libraries are all terrible), so the difficulty is really in 1) figuring out what XML to send to Exchange and what you're going to get back and 2) serializing/deserializing XML to objects and vice versa. That last part isn't too bad, actually, you can see what we do for calendar stuff - it's mostly a lot of XPath.

Microsoft has pretty fantastic documentation on it's SOAP responses, so it's a matter of combing through those and figuring out what you need. Here's a test script I use to talk to Exchange and experiment:

https://gist.github.com/trustrachel/b70d79ea670f048ed165

Once you know the XML request/response formats, the rest of it is pretty straightforward.

Are you going to be at PyCon? I'll be there and would be happy to chat about it in person if you are.

nicklasb commented 9 years ago

Hi @fizxmike, @trustrachel ; I will start to at least look into e-mail support in a while, I could also use that kind of functionality in a couple of months, I'd be happy to try and contribute if I can.

nicklasb commented 9 years ago

Hi @fizxmike, @trustrachel, I have started implementing some stuff, It with regards to difficulty it seems to be pretty straightforward (as soon as one's tired brain realized how lxmls builder works, duh), and I need to ask some questions:

  1. There needs to be some refactoring done to make room for messages, for example, the get_item(s) should really be called get_calendar_item(s). Are there any externa usages that I am not aware of, outside of this project, that use get_item directly?
  2. The tests feature canned responses, is there some nice logging function built-in that I can use to output those or do I have to do that manually?
  3. Wouldn't it be nice to rename tests/fixtures to test/fake or something else? Kind of confusing with an import from fixtures of ..fixtures.
  4. Would anyone mind some PEP 8 reformatting being done?

And @fizxmike, I you are about to also look at this, it would be great if we could cooperate about it? Can I e-mail you on the mail you have here on github?

With regards to the estimate you asked @trustrachel about, it is difficult to say, but I'd say 10-15 hours if one knows one's way around the library. E-Mail handling seems very similar to calendar handling. I have more difficulties understanding how the unit tests work.

nicklasb commented 9 years ago

Actually, I would like to start with a complete PEP 8 overhaul, as PyCharm goes so nuts with the coloring and indentation warnings that I sort of have problems seeing the forest for the trees.

nicklasb commented 9 years ago

So to get things started I just tested so that is works, and it does, I put the simplest possible test in a gist here: https://gist.github.com/nicklasb/a7986731ab15a6809f0b

Just copied the code into there.

I run that code in the root of the repo against my server. You will have to change the item Id so something from the find_item() response, of course.

One difference is that I don't have a real certificate on the exchange server, so I have to add verify=False in pyexchange/connection.py: response = self.session.post(self.url, data=body, headers=headers, verify=False) for it not do go bananas.

I agree with the proposal to make certificate check a setting, by the way. It does not add to anything to force verification, except hassle for developers that wants to try it out.

nicklasb commented 9 years ago

BTW, I will need your response on the above questions and your feedback before I can continue.

fizxmike commented 9 years ago

Hi @nicklasb, It is awesome that you got started on this! I don't have time till later next month to work on this.

Here are my suggestions, for what they are worth:

  1. I would simply add your keyword argument, include_mime_content, to the get_item function and default to False (and only include it in M.ItemShape when include_mime_content is True) it looks like get_item needs to stay general enough to retrieve any exchange_id, no need to create a separate, almost identical function, just make the original one more flexible for now: https://github.com/linkedin/pyexchange/blob/master/pyexchange/exchange2010/soap_request.py#L79
  2. No Comment.
  3. No Comment.
  4. In pycharm you can switch off PEP 8 formatting warnings (which is a little too restrictive and outdated - 2001 - in my opinion) in File -> Settings -> Inspections, under Python, uncheck: "PEP 8 naming convention violation" and "PEP 8 coding style violation" and we are all happy!
nicklasb commented 9 years ago
  1. You are right, I agree, no point at all to make an extra function.
  2. With regards to the tests, I have realized that it wasn't canned responses.
  3. I'll leave that one
  4. Ok. I know about those settings, but have the completely opposite point of view. :-) Which is that coding conventions are really helpful and makes it easier for everyone to understand code. Every project I am involved in, in any language, strives for strict adherence to convention, and in Python's case PEP 8 compliance. As I rather think that is the future and the way to move forward...possibly because all the chaotic projects from 2001 I was part of that didn't comply with anything. So I'll just ignore the funny colors. :-)

Anyway, I think I have enough to stand on to begin at some level. Putting up the m¤%¤¤#f#¤=#ng exchange server* took me almost two workdays, though, so I have lost a lot time, but I hope to be able to at least implement something.

fizxmike commented 9 years ago

RE: 4, I've obviously touched upon a hot topic of hot debate... I am not opposed to updating PEP 8 understanding that we all have modern text editors that are agnostic to spaces vs. tabs, and have 4K monitors that are agnostic to line ending length... other than that, I think PEP 8 is a good idea, in general.

RE: Microsoft, Don't you know that worse is better? http://www.jwz.org/doc/worse-is-better.html

got-root commented 9 years ago

Hey @nicklasb, I think it would be fine to go through and fix code to comply with PEP8. I've actually done that several times with the files I've worked with. However, maybe just exclude the 4 space indentation and line length rules. I haven't messed with PyCharm much, I'm more of a vim guy, but can you configure it to just ignore those two PEP8 rules?

Here is the style guide that @trustrachel wrote up when she made this project public: https://github.com/linkedin/pyexchange/blob/master/CONTRIBUTING.rst#style-guide

nicklasb commented 9 years ago

@fizxmike : Well, yes you kind of have. :-) So I'll just respond to your points:

Indentation: So if they are so agnostic, why no just follow the the PEP 8 guidelines that most other projects follow anyway? Then it wouldn't matter, would it? :-)

Line length, it is not just about screen size:

@got-root : PyCharm doesn't cut lines, one does that manually. I'll just set indentation to two tabs. Sorry, SPACES. :-)

I'll see what I'll do. I am a bit apprehensive about having my first commit to a project affect all its files. :-)

nicklasb commented 9 years ago

To get back on topic: I was just going to add the get_Item functionality when I saw that there two conflicting elements that aren't present in the other get_item-form.

Get_Item for messages: IncludeMimeContent

Get_Item for calendar events: *AdditionalProperties(contains FieldURI:s)

So, I will add both, and lay it upon the caller to include the correct ones, but raise an error if both are supplied, saving people from at least some inexplicable 500 errors. Because the other kinds, tasks and contacts, have no extra parameters, so it works for that use as well. Should be kind of future-safe.

New signature: def get_item(exchange_id, format=u"Default", include_mime_content=False, additional_properties=None)

nicklasb commented 9 years ago

Funny thing, again off topic: I realized that I am creating a new file and putting (c) 2013(5) LinkedIn Corp. at the top. First, I am giving away copyrights, which doesn't matter really, it is just a variant of another and LinkedIn fully deserves the credit here anyway.

But I don't know if I can give away copyright to someone without them formally accepting it. And all mergers here doesn't work at LinkedIn.

If I would create something completely from scratch in this project, should I then put (c) 2015 Optimal BPM at the top? Asking since all IDE:s and build environments I know of only have per-project copyrights.

nicklasb commented 9 years ago

The property management code in the folder and calendar seem a bit copy&pasted. Is it OK if I don't duplicate that one more time but instead creates a element-module and a BaseExchangeElement class that has an id, a service and property management? Especially considering that that contact and task is still to be implemented?

nicklasb commented 9 years ago

So I am spamming on, I have started implementing this now, as I have all these questions:

  1. _ parse_event_properties xpaths doesn't start from the event, but from the root of the XML document. This means that it only works in the special case where a result of a query has one event in the response, it cannot be used looping events unless these events reconstruct the tree up to the top. And that tree has to look exactly the same, it can not be used to parse an item contained in another structure. I am not sure why this is that way.
  2. After I add my code to exchange2010/_ _init.py, it grows to about 1200 rows. a) I was always under the impression that _ _init.py-files were used more as a API-presenters, importing stuff, and not holding the entire implementation. b) I would even without considering what nature _ _init__ should have like to split that module into smaller modules, basically one for each class. I mean they're not that small.
  3. I find lots of things that I would like to extract to other classes and so on.

So, my question is, when it is done, I will add my stuff, without doing structural changes. But beyond that, do any of you agree with me on these things? Because I might have misunderstood how stuff is supposed to be done.

nicklasb commented 9 years ago

Because with regards to messages, most fields are in the root of the item, an thus doesn't even need any xpath references, but can be inferred from correctly capitalizing the properties: https://github.com/OptimalBPM/pyexchange/blob/get_item/pyexchange/base/message.py

nicklasb commented 9 years ago

I have just realized that there are a bit too many questions that needs answering before I do this, my process will be too inefficient this way, so I will stop for now. I think that there are several structural issues that need to be addressed before I or @fizxmike start adding large amounts of code.

The way it is now, I would have to do way too much copying and pasting, the code base needs to become more extensible, there need to be a discussion about that that leads to defined tasks. If the ambition of pyexchange is to implement all exchange access methods, the structure must reflect that.

The branch I have worked with is this one, you can see the changes there: https://github.com/OptimalBPM/pyexchange/tree/get_item

The most important additions are the base message class: https://github.com/OptimalBPM/pyexchange/blob/get_item/pyexchange/base/message.py

and the changes at: https://github.com/OptimalBPM/pyexchange/blob/get_item/pyexchange/exchange2010/__init__.py

In short, and with the exception of importing all the properties of the message class using a spreadsheet, which was so fun I had to write about it, I am basically copying the calendar stuff and rewriting it for messages, and that doesn't feel very good, honestly, as there are huge amounts of code being duplicated.

catermelon commented 9 years ago

Hi @nicklasb, I've been away for the last couple of weeks, so I wanted to first apologize for not getting back to you! Let me try to address at least some of the items here.

The minor stuff first!

PEP8 - 2 spaces and unlimited line length is LinkedIn's in-house style, so that was a deliberate choice. I personally have come to prefer it over classic PEP8, so I want to keep that. I use PyCharm, too, and you can turn that particular alert off. PEP8 is fantastic, but it's not law. :)

Copyright - Not a lawyer, but as far as I understand it, yes, LinkedIn does own the copyright on the library since I work for them, but since it's released under the Apache 2 license they grant everybody the right to use, copy, modify, distribute, etc, etc, for any reason, including commercial. It's no different than my personal libraries - I own the copyright on them, but I license it under Apache as well. So yes, you would want to put the copyright notice in there. You have now come to the end of my understanding on copyright.

As for the other issues you raised. The library will 100% need some refactoring if we're going to make it implement all of the Exchange protocol. I don't like 1200 line files more than anybody else. It's perfectly fine to put stuff in __init__.py btw, but I agree for readability splitting things up is good.

A question for you - is there a part of the email protocol we could implement in the current style? That would let us figure out what parts of the library could use refactoring first.

nicklasb commented 9 years ago

PEP8: Meh. With that attitude, you'll never get that little "LinkedIn"-company or whatchamacallit out of the basement. :-)

Copyright: Me neither. Hm. One should really know about these things. Well, I'll put LinkedIn in there when I do it, I certainly make no claims.

We could, but its just uninspiring to copy&paste code all over, knowing that it all has to be rewritten soon anyway. Also, I have no customers breathing down my neck on this and don't positively need this functionality until in December or something. And if I needed to do a quick'n dirty solution I'd just use get_item and implement directly against the API. I can work directly against the XML response anyway, I don't really need to transform it into an Python object for my purposes. Even though perhaps that would be nicer.

But my point is that I'd rather do it correctly. And it seems to me that refactoring this project would not be a very complicated task, at least at its present state. But I don't feel like turning it inside out, I am not sufficiently structurally aware.

Yes, generalizing get_item could be done as we've talked about in this thread, and then at least people could get the messages easily.