Closed michele-deus closed 8 years ago
@michele-deus: What happens if you write to io.StringIO instead of io.BytesIO? Does it raise the same error?
I was writing to cStringIO. Now, writing to BytesIO fails with: 'unicode' does not have the buffer interface
Writing to io.StringIO works. So the problem must be related to cStringIO.
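For anyone landing here with the same error, here is a minimal sketch of the difference between the two buffers. It uses only the standard library io module. The sample text is taken from the failing row, and the variable names are made up for illustration.

```python
import io

# Text containing a non-ASCII character (the euro sign from the failing row).
text = u"HFRI Equity Index - NET (\u20ac)"

# io.StringIO is a text buffer: it accepts unicode text directly.
text_buffer = io.StringIO()
text_buffer.write(text)

# io.BytesIO is a byte buffer: writing unicode text to it raises TypeError.
byte_buffer = io.BytesIO()
try:
    byte_buffer.write(text)
    bytes_io_error = None
except TypeError as error:
    bytes_io_error = error

# Encoding explicitly turns the text into bytes that io.BytesIO accepts.
byte_buffer.write(text.encode("utf-8"))
```

So a writer that emits unicode rows needs a text buffer, or the rows have to be encoded before they reach a byte buffer.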
Yes, in my experience, any StringIO that is not io.StringIO is a recipe for disaster.
I take it your immediate issue is solved?
Anyway, if I understand your use case correctly, you are using cutplace to validate data that are already in a nice and cosy Python list. You are validating by writing to a StringIO just because the API does not yet provide any sensible way to validate in-memory data and insists on a file-like object.
Am I correct? If so, it might be worthwhile to add some functionality to validate data outside of file-like objects.
Yep, I resolved my issue. In fact, I'm reading a file and writing it to the io.StringIO. I'm doing it this way because I prefer to validate line by line instead of validating the whole file, but that is probably down to my limited knowledge of your library.
Thanks for the confirmation, closing this.
If you just want to validate and read a file line by line, use cutplace.rows, for example:
import cutplace
for row in cutplace.rows('some_cid.ods', 'some_data.csv'):
    pass  # ...or do something with `row`.
An example error (copied and pasted from the HTML output of a Django view):
Validation Error: (R2875C1): cannot write data row: 'ascii' codec can't encode character u'\u20ac' in position 35: ordinal not in range(128); row=[u'B004', u'IDX', u'', u'HFRI Equity Index - NET (\u20ac)', u'BMK', u'Benchmark', u'P', u'Performance', u'AZ', u'20151230', u'1465.50016373', u'', u'EUR']
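The codec part of that message can be reproduced without cutplace. The sketch below is only an illustration: it joins the first four fields of the row with a one-character delimiter (the comma is an assumption, the actual delimiter is not shown in this thread) and tries to encode the result as ASCII.

```python
# First four fields of the failing row, as shown in the error message.
row = [u'B004', u'IDX', u'', u'HFRI Equity Index - NET (\u20ac)']

# Assumption: the fields are joined with a single-character delimiter.
line = u",".join(row)

# Encoding to ASCII fails on the euro sign (U+20AC).
try:
    line.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as error:
    ascii_error = error

# The index of the offending character is available as ascii_error.start.
failing_position = ascii_error.start

# An encoding that covers the euro sign, such as UTF-8, works.
utf8_bytes = line.encode("utf-8")
```

Under that assumption the failing index matches the "position 35" in the message above, which suggests the row is being encoded with the default ASCII codec somewhere on the way into the byte-oriented buffer.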
The code for opening the file is:
datafile = io.open(filename, mode='r', encoding=encoding, newline=line_delimiter)
The encoding is taken from the CID and is UTF-16. To validate, I write row by row with a cutplace.Writer to a BytesIO. For every row I do:
for line in datafile:
    line = line.strip(line_delimiter)
    raw_row = line.split(delimiter)
    row = []
    for e in raw_row:
        row.append(e.strip())
    try:
        writer.write_row(row)
    except cutplace.errors.CutplaceError as err:
        log.append(("at line %s: Validation Error: %s" % (row_nr, err), "(at line %s): %s" % (row_nr, line)))
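The reading half of that loop can be tried on its own, without cutplace. This is a rough sketch under assumptions: the semicolon delimiter, the "\n" line delimiter, and the sample rows are invented, and enumerate is used here to supply the row_nr that the snippet above does not define. It shows that every field coming out of io.open with an encoding is already decoded text, which is why the rows need a text buffer rather than a byte buffer.

```python
import io
import os
import tempfile

delimiter = u";"        # hypothetical field delimiter
line_delimiter = u"\n"  # hypothetical line delimiter
sample = u"B004;IDX;;HFRI Equity Index - NET (\u20ac)\nB005 ; IDX;;Other\n"

# Create a temporary UTF-16 file to stand in for the real data file.
fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
try:
    with io.open(path, mode="w", encoding="utf-16", newline="") as target:
        target.write(sample)

    rows = []
    with io.open(path, mode="r", encoding="utf-16", newline=line_delimiter) as datafile:
        # enumerate provides the row number used in log messages.
        for row_nr, line in enumerate(datafile, start=1):
            line = line.strip(line_delimiter)
            row = [e.strip() for e in line.split(delimiter)]
            rows.append((row_nr, row))
finally:
    os.remove(path)
```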