GoogleCodeExporter closed 8 years ago
Tested this in production environment, works well.
Original comment by mjg1964
on 21 Oct 2011 at 8:13
Hi henk-jan,
Mapping raw input requires creating a bots/usersys/mapping/raw folder that
contains an __init__.py file.
This should probably be included in the upgrade plugin.
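Until the plugin does this, the folder can be created by hand or with a small script. The following is only a sketch of that step: the function name `ensure_mapping_package` and the `usersys_dir` parameter are made up for illustration and are not part of bots.

```python
from pathlib import Path


def ensure_mapping_package(usersys_dir, editype="raw"):
    """Create <usersys_dir>/mapping/<editype>/ with an empty __init__.py.

    Hypothetical helper, not a bots function. Existing folders and
    files are left untouched.
    """
    package_dir = Path(usersys_dir) / "mapping" / editype
    # parents=True also creates the "mapping" folder if it is missing.
    package_dir.mkdir(parents=True, exist_ok=True)
    # The empty __init__.py makes the folder importable as a package.
    (package_dir / "__init__.py").touch(exist_ok=True)
    return package_dir
```

Calling `ensure_mapping_package("bots/usersys")` from the directory that holds the bots installation would give the layout described above; adjust the path to your own setup.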
I have been testing this with PDF preprocess, works ok.
Kind Regards,
Mike
Original comment by mjg1964
on 12 Dec 2011 at 10:36
Hi Mike,
You are right, that should be included.
Can you take a look at the character set? I was wondering whether this works OK.
henk-jan
Original comment by hjebb...@gmail.com
on 12 Dec 2011 at 11:18
I am using iso8859-1 for input and output, receiving a PDF and converting it to
text with just inn2out in the mapping script for now. This works OK. As mentioned
on the PDF preprocess issue, I am now trying pdfminer for better results.
Original comment by mjg1964
on 13 Dec 2011 at 3:54
Hi Mike,
Another option would be to use the character set from the channel.
iso8859-1 is OK most of the time, but e.g. Russia, Asia etc. use other sets.
I work with a lot of character sets.
This would also be more in line with the other editypes.
henk-jan
Original comment by hjebb...@gmail.com
on 13 Dec 2011 at 11:47
Hi henk-jan,
I created this "raw to raw" mapping script as an example. It gets the charset
from the input and output channels and uses these to decode and encode
respectively. I don't have any files that require a special charset, so I have
only tested this with ASCII files. This example just reads the input records and
writes them to the output with no changes (apart from the charset).
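The decode/encode step on its own can be sketched in plain Python as below. This is not the attached script and it does not use any bots API: the function name `transcode` is made up, and the two charset names are simply passed in as parameters, on the assumption that the mapping script has already read them from the channels.

```python
def transcode(raw_bytes, in_charset, out_charset):
    """Re-encode raw_bytes from in_charset to out_charset.

    Hypothetical helper. The charsets are assumed to come from the
    input and output channel settings; how they are read is not shown.
    """
    # Decode with the input channel's charset to get text...
    text = raw_bytes.decode(in_charset)
    # ...then encode that text with the output channel's charset.
    return text.encode(out_charset)
```

For example, `transcode(data, "koi8-r", "utf-8")` would convert a Russian KOI8-R file to UTF-8. With the default strict error handling, a character that the output charset cannot represent raises `UnicodeEncodeError`, which is probably preferable to silently losing data.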
Actually, I envisage that in a route only one side would be "raw"; the other
side would be some known format that bots has a grammar for.
Raw input could be used e.g. for a raw to edifact mapping, with input from a
free-format text file.
Raw output could be used e.g. for an edifact to PDF document, using reportlab.
Original comment by mjg1964
on 26 Dec 2011 at 4:14
Attachments:
Original comment by hjebb...@gmail.com
on 22 Jun 2012 at 3:29
Original comment by hjebb...@gmail.com
on 10 Sep 2013 at 12:44
Original issue reported on code.google.com by
hjebb...@gmail.com
on 20 Oct 2011 at 11:02