Open kjohnsonecl opened 3 years ago
Well, now I feel ignorant and stupid: pasting the scripts into the original issue as plain text makes the shell comments (the lines starting with #) render as headings, so it looks like I am SHOUTING those bits. Argh.
Sometimes you need to step outside your comfort zone.
grep won't display the offending line in the .sql file, but...
mumble# sqlite3 bacula.db
sqlite> SELECT * FROM Filename WHERE FilenameId='46734';
46734|Pic - Pilot?s Interconnecting Box �(105A15).htm
So, one way or another I can solve the problem with that record and then move on to find any more lurking in that table.
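For finding any others lurking in that table, a rough sketch along these lines might help. It reads every value as raw bytes and reports the primary key of each row that does not decode as UTF-8. The function name is made up for illustration, and the text column name `Name` is an assumption (only `FilenameId` appears in the query above), so adjust it to the real schema.

```python
import sqlite3


def find_invalid_utf8(db_path, table="Filename", id_col="FilenameId", text_col="Name"):
    """Return (id, byte_offset, raw_bytes) for each row whose text is not valid UTF-8.

    Hypothetical helper: the default column name "Name" is an assumption.
    Table and column names are pasted into the SQL as-is, so pass only
    trusted identifiers.
    """
    conn = sqlite3.connect(db_path)
    # Return TEXT values as raw bytes; the default str decoding would raise
    # on the very rows we are looking for.
    conn.text_factory = bytes
    bad = []
    try:
        query = f"SELECT {id_col}, {text_col} FROM {table}"
        for row_id, raw in conn.execute(query):
            if not isinstance(raw, bytes):
                continue  # NULL or numeric value, nothing to decode
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError as exc:
                # exc.start is the byte offset of the first invalid byte
                bad.append((row_id, exc.start, raw))
    finally:
        conn.close()
    return bad
```

Calling `find_invalid_utf8("bacula.db")` and printing each tuple should list every suspect FilenameId in one pass, which is the kind of primary key information the error message itself does not give.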
I still think it would be a plus for the message to have more info.
I think it would be helpful if pgloader could provide primary key information on the record that triggers the UTF-8 error.
I am trying to convert a Bacula database from SQLite to PostgreSQL using pgloader. pgloader seems like an amazing tool, which I need because I am not a database specialist, more of a system and network administrator generalist. Because of this, I am probably overlooking ways of doing things directly with the database tools and am instead using the tools I know.
I have been loading the tables one at a time and resolving issues in each one as I go, for the most part. (There are several tables with issues around timestamp casting that I have put off to the end.)
The filename table has two columns: a filename ID, and a filename. When I try to load that table, I encounter the "ERROR Illegal :UTF-8 character starting at position 34." error message, and table loading stops. At least, I think that causes table loading to stop. That is the only "ERROR" reported in the output file.
My thinking was that if I could find the record that triggers the UTF-8 error, I could figure out a solution to this problem, or at least make progress. I tried several ways of examining the .sql file used to build the bacula.db file used for conversion (see the first bash script below), but without useful results. Then I tried turning on pgloader --debug. That didn't help directly, but about that time I realized that perhaps I could figure something out by looking at the records that did get loaded (about 45000) vs. the ones that did not (about 400000).
The output from pg_dump lists the records from the filename table in filename ID order. This matches the order in the bacula.sql file, so it did not seem stupid to think that the next record in filename ID order (46735) was the problematic one. I deleted that record from the .sql file and rebuilt the db file using the first of the shell scripts. Then I re-ran pgloader with the second shell script. Same error.
I am willing to give up some assumptions, but I do not have a good idea of where to start. Hence, if the UTF-8 error message could come with more data...
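One possible starting point that does not depend on pgloader at all is to scan the .sql dump for lines that are not valid UTF-8. The sketch below opens the file in binary mode, so no locale or encoding guess gets in the way, and reports the line number and the byte offset of the first bad byte on each such line. The function name is hypothetical, and it assumes the dump holds one record per line.

```python
def invalid_utf8_lines(path):
    """Yield (line_number, byte_offset, raw_line) for each line that is not valid UTF-8.

    Hypothetical helper; byte_offset is counted from the start of the line.
    """
    with open(path, "rb") as fh:  # binary mode: no decoding happens on read
        for lineno, raw in enumerate(fh, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError as exc:
                yield lineno, exc.start, raw.rstrip(b"\r\n")
```

Looping over `invalid_utf8_lines("bacula.sql")` and printing each tuple with `repr()` shows the offending bytes as escapes such as `\x92`, instead of whatever the terminal makes of them.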
[ ] pgloader --version
[ ] did you test a fresh compile from the source tree?
No
[ ] did you search for other similar issues?
Yes
[ ] how can I reproduce the bug?
Script used to modify and prepare database for conversion
Script used to run pgloader
pg.load file used to attempt conversion:
From pg_dump:
from bacula.sql-ref: