djhenderson / pyodbc

Automatically exported from code.google.com/p/pyodbc
MIT No Attribution

Feature request: Implement context managers in pyodbc #100

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I would like to write python code like this:

import pyodbc
with pyodbc.connect(dsn='mydsn') as conn:
    with conn.cursor() as curs:
        for row in curs.execute('select * from mytable'):
            ...

To make this possible, we will need the connect() and cursor() objects to 
implement the __enter__ and __exit__ methods.  I would think that the 
__enter__ method should just return the object:

def __enter__(self):
    return self

The __exit__ method needs to be aware of any exceptions that occurred and 
call its close method.  For the connect() object, it should also do a 
commit, unless a rollback already happened:

def __exit__(self, exc_type, exc_value, traceback):
    if exc_type is None:
        if not self.rollback:  # hypothetical flag, set if rollback() was called
            self.commit()
    self.close()
    return False  # returning a true value here would suppress the exception

For cursor() objects, if autocommit=True, it should call commit() (unless 
rollback was called).  Then it should call close():

def __exit__(self, exc_type, exc_value, traceback):
    if exc_type is None:
        if self.autocommit and not self.rollback:
            self.commit()
    self.close()
    return False  # do not suppress exceptions

Original issue reported on code.google.com by gerald.b...@gmail.com on 1 Jun 2010 at 5:25

GoogleCodeExporter commented 9 years ago
This is not a defect, just a feature request

Original comment by gerald.b...@gmail.com on 1 Jun 2010 at 5:26

GoogleCodeExporter commented 9 years ago
The reason I've never implemented this is because you can write code like this:

def f():
  conn = pyodbc.connect(dsn='mydsn')
  curs = conn.cursor()
  for row in curs.execute('select * from mytable'):
      ...

Since Python uses reference counting, the cursor and connection will be closed immediately when you exit the function, either normally or via an exception.

I'm not sure what benefit there would be to a context manager.  Therefore, I've been holding off implementing it so newbies don't think pyodbc is so complicated (not to mention I'd never use them myself ;).  Do you have any use cases where it would be useful?

Original comment by mkleehammer on 1 Jun 2010 at 5:31

GoogleCodeExporter commented 9 years ago
I guess I have two observations:

1. While it is true that Python uses reference counting, that does not mean that connect.close() or cursor.close() are called automatically when the reference count drops to zero.  (In fact, I would consider it a bug if Python did that.)  For example, this does not happen for file objects unless you use a context manager.

e.g.

def f():
   fp = open('myfile')
   ...

# The file is still open, even though I have no reference to it.  OS resources are still in use.  Even if Python GCs its resources, it does not auto-close the file.

with open('myfile') as f:
   ...

# The file is now closed, since I exited the context

2. I would like to see the context managers call the commit() function in the 
__exit__ method before calling close().

Original comment by gerald.b...@gmail.com on 1 Jun 2010 at 5:43

GoogleCodeExporter commented 9 years ago
Actually it does work the way I mentioned.  I can assure you that the file is closed when you leave the function f.  (I even wrote a quick test this morning, but since I'm familiar with the Python internals, I was sure.)  It would be a bug (a memory leak) if it did not work that way.

Here's one way to think about it: Once you leave f(), there are *no* references left to the file, not even internal ones, so there is no way it could be closed later.  What reference would later code use to close the file?

All objects are destroyed when their reference count reaches zero.  The garbage collector is only needed for breaking cycles, like two objects that point to each other.  In that case, both would have a refcount of 1 - the cycle collector detects these and destroys both objects.

If you really want to test it for yourself, here is an easy way.  Instead of using a file, create your own class and print something in the __del__ method, which is called when an object is destroyed.
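
For example, a minimal sketch of that test (the Tracker class name is made up); on CPython the "destroyed" line prints as soon as f() returns:

class Tracker:
    def __del__(self):
        print('destroyed')   # runs when the last reference goes away

def f():
    t = Tracker()            # the only reference to the Tracker is this local
    print('inside f')

f()
print('after f')             # on CPython, 'destroyed' appears before this line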

Similarly, I can assure you that the database cursor and connection will be closed in this example when f() is exited, either normally or due to an exception being raised.

def f():
  cnxn = pyodbc.connect(...)
  cursor = cnxn.cursor()
  ...
  cnxn.commit()

As for an autocommit, that would actually make coding more difficult.  In the example above, notice the commit call at the bottom.  This pattern ensures that database work is only committed if the function successfully makes it to the bottom.

If an exception is raised, the cursor and connection are still closed as the function is exited.  However, since the commit didn't happen, closing the connection causes any work done in f() to be rolled back, which is exactly what you want if there is a failure.  It may have written to 2 out of 3 tables and encountered an error in the 3rd -- since they all didn't complete, you want the transaction to be automatically rolled back.
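
Concretely, a sketch of that failure path (the table and column names are made up for illustration):

import pyodbc

def transfer():
    cnxn = pyodbc.connect(dsn='mydsn')
    cursor = cnxn.cursor()
    cursor.execute("update accounts set balance = balance - 100 where id = 1")
    cursor.execute("update accounts set balance = balance + 100 where id = 2")
    cnxn.commit()   # only reached if both updates succeed

# If the second update raises, commit() is never called; the connection is
# closed as transfer() unwinds, and the first update is rolled back.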

In summary, I considered the context manager, and would still be happy to write it, but:

* I'd rather people understand how it works and why they don't need it.  If I put it in, lots of new pyodbc programmers would be taught that they need it, causing confusion.

* The code without it is much simpler -- there is almost nothing extra.

* The pattern is pretty foolproof as long as the commit is not forgotten.  Adding an automatic commit at close would be disastrous though.  (Obviously there is the autocommit mode, but that should be discouraged except for use in tiny utilities.)

* I haven't found a single benefit yet.

* Finally, wrappers for Connection and Cursor could easily be created to add this, though I don't know why.  (I've created wrappers a few times, for things like internal logging, etc.  It's about 20 lines of code.)

Does that make sense?  Perhaps I should create a page for this topic in the docs?

Original comment by mkleehammer on 3 Jun 2010 at 3:10

GoogleCodeExporter commented 9 years ago
Yes, of course the __del__ method is called when an object is destroyed, but that is not what you said before.  You said that the close() method was called.  (FWIW, it's easy to verify that that is not the case.)  Now, it may be that your implementation of __del__ calls close (as does file.  I was wrong there).  Fair enough.

If autocommit were done in the __exit__ method of a context manager, the method would first see if an exception had occurred (first arg is not None).  Only if no exception occurred would commit be called.  Thus I could write:

with conn.cursor() as blah:
  # do stuff

and know that, if the suite exits normally, my changes would be committed.  If an exception occurs, they would be rolled back.  Simpler code!  The smarts are in the context manager.  I can't forget to commit my transactions and my data is safe (rollback) if an error occurs.
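
Sketched out, the __exit__ being described might look something like this (a sketch only; it is simpler than the earlier proposal in that it relies on close() without commit() to discard uncommitted work):

def __exit__(self, exc_type, exc_value, traceback):
    if exc_type is None:
        self.commit()   # the block finished normally: keep the work
    self.close()        # on an exception, closing without commit rolls back
    return False        # never suppress the exception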

Thus, these two would be equivalent:

def f():
    all_is_good = True
    blah = conn.cursor()
    # do stuff
    if something_bad_happens:
        all_is_good = False
    if all_is_good:
        conn.commit()

with conn.cursor() as blah:
    # do stuff
    if something_bad_happens:
        raise SomethingBadHappened

The first approach uses Python's function machinery and implicit garbage collection to get the desired effect.  The second approach explicitly exploits the context manager syntax to do the same thing, with less coding.  "Explicit is better than implicit" or so the Zen of Python claims.

In summary:

* One could use either approach depending on preference and experience.

* The code using a context manager is simpler and more robust.

* The pattern is foolproof AND you can forget about calling commit.  (Autocommit in __exit__ is not disastrous following the approach I give.)

* Benefits include simpler coding style and no need to create functions just to get this kind of effect.

Of course, it's easy to create wrappers to do this.  I've done that many times with other objects, including dbs, log files, locks and more.  I'm simply lobbying for support in the package to make that unnecessary.
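
For reference, a wrapper along those lines might look like this (a sketch only; connect_managed is a made-up name, and the commit-on-success / rollback-on-exception policy is the one argued for above, not something pyodbc itself provides):

import pyodbc
from contextlib import contextmanager

@contextmanager
def connect_managed(*args, **kwargs):
    cnxn = pyodbc.connect(*args, **kwargs)
    try:
        yield cnxn
        cnxn.commit()      # the with-block finished normally: keep the work
    except Exception:
        cnxn.rollback()    # the with-block raised: discard the work
        raise
    finally:
        cnxn.close()

# Usage (hypothetical DSN):
# with connect_managed(dsn='mydsn') as cnxn:
#     cnxn.cursor().execute('insert into mytable values (1)')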

Original comment by gerald.b...@gmail.com on 3 Jun 2010 at 4:10

GoogleCodeExporter commented 9 years ago
I think you've convinced me due to the automatic commit by looking at the exception information.

One small nit to make it a fair comparison, though: I never use a flag for committing, since an early return or exception ensures commit isn't called:

def f():
   cnxn = pyodbc.connect()
   cursor = cnxn.cursor()
   # do stuff
   cnxn.commit()

You'll notice I create a new connection each time since I'm using connection pooling.

More importantly, since commits are at the *connection* level, not at the cursor level, I would think we'd want the connection in the with, not the cursor.  I'd just let the cursor autoclose.

def f():
  with connect(...) as cnxn:
      cursor = cnxn.cursor()
      # do something

Side note: Having to allocate a separate cursor is extra work 99.9% of the time.  I actually added a Connection.execute() method which allocates a cursor for you.  If you are only executing a single statement, it is very handy:

def f():
  with connect(...) as cnxn:
      rows = cnxn.execute(...).fetchall()

Anyway, thoughts on adding it to the connection only?

Original comment by mkleehammer on 3 Jun 2010 at 4:22

GoogleCodeExporter commented 9 years ago
I see far less benefit to adding it to the cursor.  That is quite arguably syntactic sugar (though I _like_ sugar!) and though mostly harmless (like Earth! (Douglas Adams)), may not be worth it.  Just to be sure though, what is the downside to not (maybe never) calling cursor.close()?

Original comment by gerald.b...@gmail.com on 3 Jun 2010 at 4:26

GoogleCodeExporter commented 9 years ago
There is no downside because Cursor.__del__ and Connection.__del__ both close their resources.

Original comment by mkleehammer on 3 Jun 2010 at 4:45

GoogleCodeExporter commented 9 years ago
In the v2unicode branch, which will become 2.1.8 as soon as I can get it tested 
on all the different platforms.

Original comment by mkleehammer on 5 Sep 2010 at 6:19

GoogleCodeExporter commented 9 years ago
Fixed in 2.1.8

Original comment by mkleehammer on 6 Sep 2010 at 5:36

GoogleCodeExporter commented 9 years ago

Original comment by mkleehammer on 21 Nov 2010 at 4:44

GoogleCodeExporter commented 9 years ago
Comment 6 by project member mkleehammer, Jun 3, 2010 
I think you've convinced me due to the automatic commit by looking at the exception information.

but...I don't see the code.  Thus I'm confused why this enhancement is marked as "Complete".

__exit__ simply calls Connection_exit which calls Connection_clear (in 
Connection.cpp)...none of these routines check the exception state to decide 
whether to commit() or to rollback().  Shouldn't __exit__ call commit() if 
there have been no errors?  I thought that was the point of making the 
connection a context manager.

sqlite does this.  11.13.7.3. Using the connection as a context manager
New in version 2.6.

http://docs.python.org/library/sqlite3.html

Connection objects can be used as context managers that automatically commit or 
rollback transactions. In the event of an exception, the transaction is rolled 
back; otherwise, the transaction is committed:

So does cx-oracle: Connection.__exit__()
The exit point for the connection as a context manager, a feature available in 
Python 2.5 and higher. In the event of an exception, the transaction is rolled 
back; otherwise, the transaction is committed.

http://cx-oracle.sourceforge.net/html/connection.html
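
Sketched in Python, the commit-or-rollback semantics described in both of those quotes would look like this (illustrative only; pyodbc's Connection is implemented in C, and neither quote says anything about closing the connection):

def __exit__(self, exc_type, exc_value, traceback):
    if exc_type is None:
        self.commit()    # no exception raised in the with-block: commit
    else:
        self.rollback()  # an exception was raised: roll back
    return False         # propagate the exception, if any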

Original comment by je...@livedata.com on 15 Aug 2011 at 9:41