This document describes the goals and the basic design for cutplace version 0.8, which
addresses shortcomings and major issues with cutplace 0.7.
Goals
cutplace works with Python 3.4. If possible with reasonable effort, it should also work
with earlier versions of Python 3. From experience, Python 3.3 and 3.2 should be easy
to support.
cutplace works with Python 2.6 and Python 2.7 using the same code.
The API is more pythonic and looks less like a Python API designed by a Java developer
;-)
All important functions and classes are accessible using "import cutplace", in
particular Abstract*, CutplaceError and Interface.
The ICD (interface control document) acronym is changed to CID (cutplace interface
description), which is much more pleasant to pronounce.
The Decimal field format supports rules, in particular to specify a valid range and
possibly a format similar to Basic's "using" statement.
Readers make do without threads, thus fixing all current multithreading bugs (e.g.
sometimes getting stuck after errors, sometimes discarding the last row, performance
issues caused by polling loop hacks etc.).
ValidationListener is gone; iterators and possibly callback functions should suffice,
which reduces API complexity.
Validation is also possible for writing data.
The last line of fixed format files can be processed properly.
Validation can be disabled, in which case cutplace resorts to an optimistic reader that
only validates that the number of rows or line length matches the expectations but
does not perform any checks on field formats or rows.
The installation is performed using pip and setuptools. For Python 3.2+, https://pypi.python.org/pypi/setuptools must be installed first. For Mac OS X with MacPorts, use:
sudo port install py34-setuptools
The API includes simple functions to validate and read data:
API compatibility with cutplace 0.7 can be dropped if this is deemed advantageous or results
in cleaner code that is easier to understand and maintain.
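Several of the goals above (readers without threads, iterators instead of ValidationListener, proper handling of the last line of a fixed format file, and an optimistic mode that only checks the line length) can be pictured with a plain generator. The following is only a minimal sketch under assumptions: the function fixed_rows() and its parameters are hypothetical and not part of the cutplace API, and it assumes one record per line.

```python
import io


def fixed_rows(path, field_lengths, encoding="utf-8"):
    """
    Yield each record of a fixed format file as a list of field values.

    Hypothetical helper, not part of cutplace. It works "optimistically":
    only the line length is checked, not the format of the fields.
    """
    expected_length = sum(field_lengths)
    with io.open(path, "r", encoding=encoding) as fixed_file:
        for line_number, line in enumerate(fixed_file, start=1):
            # The last line may lack a trailing newline, so only strip
            # line terminators that are actually present.
            record = line.rstrip("\r\n")
            if len(record) != expected_length:
                raise ValueError(
                    "line %d must have %d characters but has %d"
                    % (line_number, expected_length, len(record)))
            # Cut the record into fields of the requested lengths.
            row = []
            start = 0
            for length in field_lengths:
                row.append(record[start:start + length])
                start += length
            yield row
```

A caller simply loops over fixed_rows(...). Errors surface as exceptions at the point of iteration, so neither a listener nor a background thread is needed.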
Design
Delimited and CSV files use the csv module of the standard library.
Fixed format files use io.open().
Excel files use xlrd.
ODS files have to be examined. The current built-in ods.py would have to be rewritten
to read the XML without threads. Ignoring memory issues, using ElementTree from the
standard library should be the easiest way to iterate over the relevant elements. It
might also be worthwhile to examine the various ODS readers on PyPI, although
currently it seems none of them has reached production stability or gained widespread
popularity.
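To give an idea of how reading ODS without threads could look, here is a rough sketch that iterates over the rows in content.xml using zipfile and ElementTree from the standard library. The function name ods_rows() is hypothetical. The sketch assumes Python 3, reads the rows of all sheets in sequence and ignores repeated cells (table:number-columns-repeated), so it is a starting point rather than a finished reader.

```python
import zipfile
import xml.etree.ElementTree as ElementTree

# Namespaces used by OpenDocument spreadsheets.
_TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
_TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
_ROW_TAG = "{%s}table-row" % _TABLE_NS
_CELL_TAG = "{%s}table-cell" % _TABLE_NS
_PARAGRAPH_TAG = "{%s}p" % _TEXT_NS


def ods_rows(ods_path):
    """
    Yield the rows of an ODS file as lists of cell texts.

    Hypothetical helper, not part of cutplace.
    """
    with zipfile.ZipFile(ods_path) as ods_archive:
        with ods_archive.open("content.xml") as content_file:
            # By default iterparse() reports an element once it is complete.
            for _, element in ElementTree.iterparse(content_file):
                if element.tag == _ROW_TAG:
                    row = []
                    for cell in element.iter(_CELL_TAG):
                        # A cell can contain several paragraphs of text.
                        paragraphs = cell.iter(_PARAGRAPH_TAG)
                        row.append("\n".join(
                            "".join(paragraph.itertext())
                            for paragraph in paragraphs))
                    yield row
                    # Release the cells of rows already processed.
                    element.clear()
```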
Using the same code with Python 2 and 3
Each module starts with:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals
In case filter(), map() or zip() are used, they are imported from future_builtins.
Functions that have changed between Python 2 and 3 are collected in
cutplace.compat and use the API of Python 3. Example: csv.reader(), which must be
wrapped in UnicodeReader under Python 2.
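As an illustration of what such a compatibility layer might contain, the following sketch shows a guarded import of future_builtins and a CSV helper that yields text under both Python versions. The function csv_rows() is a hypothetical stand-in for the UnicodeReader mentioned above, not the actual content of cutplace.compat, and its Python 2 branch is only suitable for ASCII compatible encodings such as UTF-8.

```python
import csv
import io
import sys

try:
    # Python 2: use the iterator based variants known from Python 3.
    from future_builtins import filter, map, zip  # noqa
except ImportError:
    # Python 3: the builtins already return iterators.
    pass


def csv_rows(path, encoding="utf-8"):
    """
    Yield the rows of a CSV file as lists of text items.

    Hypothetical helper, not part of cutplace.
    """
    if sys.version_info[0] >= 3:
        # Python 3: the csv module works on text, newline="" is required.
        with io.open(path, "r", encoding=encoding, newline="") as csv_file:
            for row in csv.reader(csv_file):
                yield row
    else:
        # Python 2: the csv module works on bytes, so decode each item
        # afterwards (assumes an ASCII compatible encoding).
        with io.open(path, "rb") as csv_file:
            for row in csv.reader(csv_file):
                yield [item.decode(encoding) for item in row]
```

Callers use the same function regardless of the Python version, which keeps version checks out of the rest of the code.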
To use argparse with Python 2, setup.py depends on https://pypi.python.org/pypi/argparse for Python 2.x only.
API overview