sqlitebrowser / sqlitebrowser

Official home of the DB Browser for SQLite (DB4S) project. Previously known as "SQLite Database Browser" and "Database Browser for SQLite". Website at:
https://sqlitebrowser.org

import csv-file deletes spaces #84

Closed deichbilder closed 10 years ago

deichbilder commented 10 years ago

Hi! When I import a CSV file into a table, every single space is deleted (v3.3.0). Special characters are also not displayed correctly, even though this worked fine in v2.0. What happened? Is it my fault? Are there any settings I can change to avoid at least the space deletion? Thanks a lot for your help!

justinclift commented 10 years ago

Oh, that doesn't sound good. Do you have a part of the csv file you could share with us, so we can figure out what's going wrong? eg first few lines or something (they need to trigger the problem too :smile:)

If you do, you're welcome to email it to me (justin@postgresql.org), and I'll forward it to Martin and Rene for investigating. :smile:

rp- commented 10 years ago

Right now I can only give you 2 workarounds, until those bugs are fixed.

  1. Save your CSV file in UTF-8 encoding, e.g. with LibreOffice Calc (the character encoding can be set on export).
  2. Quote all strings that contain whitespace. I don't know if there is a CSV RFC or how strings in fields should really be handled, but I guess we shouldn't throw away whitespace between field separators if it doesn't mess up the file format.
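
For anyone who prefers to script the two workarounds above, here is a minimal sketch in Python. It uses the standard `csv` module rather than anything from DB4S. The function name `reencode_and_quote` and the default source encoding `iso-8859-15` are assumptions chosen for illustration. It quotes every field, which is a superset of "quote all strings that contain whitespace".

```python
import csv
import io


def reencode_and_quote(raw: bytes, source_encoding: str = "iso-8859-15") -> bytes:
    """Return the CSV data re-encoded as UTF-8 with every field quoted.

    Hypothetical helper: the source encoding must be supplied by the caller,
    it is not detected automatically.
    """
    text = raw.decode(source_encoding)
    # newline="" lets the csv module handle line endings itself.
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, quoting=csv.QUOTE_ALL)
    writer.writerows(reader)
    return out.getvalue().encode("utf-8")
```

Quoting protects the spaces only if the importing tool honours quoted fields, which is what the second workaround relies on.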
rp- commented 10 years ago

Fix list for us:

justinclift commented 10 years ago

If the bug is something to do with us stripping whitespace around field separators, maybe we should add a checkbox to the import dialog about "preserve whitespace"?

rp- commented 10 years ago

For some reason we don't add any spaces at all if they are not quoted; see: https://github.com/sqlitebrowser/sqlitebrowser/blob/master/src/sqlitedb.cpp#L948

And since we do line-based parsing now, we should never get line feed (0x0A, decimal 10) or carriage return (0x0D, decimal 13) characters (the else if above).

MKleusberg commented 10 years ago

The behaviour regarding spaces goes back to commit 37e195ad6fed542e8373368096c6d1d578762866 - I can kind of see the point of that commit as well but, at the moment, would tend to remove lines 946-952. Any objections or better ideas? Maybe adding a checkbox for trimming the field contents would be a good compromise?

rp- commented 10 years ago

A checkbox sounds reasonable. I would save the content of each field in a separate buffer, so that it can be post-processed as soon as we hit a new field. It would also be nice to save the last used settings in the main config or per project.

I can do that if you are busy Martin.
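
The per-field buffer idea can be sketched as follows. This is an illustration in Python, not the DB4S implementation (which lives in C++ in sqlitedb.cpp). The names `split_csv_line`, `trim_fields` and `_finish_field` are hypothetical. It assumes one record per line and that quoted fields are never trimmed.

```python
def _finish_field(buffer, was_quoted, trim_fields):
    """Post-process one completed field (the 'separate buffer' step)."""
    text = "".join(buffer)
    if trim_fields and not was_quoted:
        text = text.strip()
    return text


def split_csv_line(line, separator=",", quote='"', trim_fields=False):
    """Split a single CSV line into fields, preserving spaces by default."""
    fields = []
    buffer = []          # characters of the field currently being read
    was_quoted = False   # remembers whether the current field was quoted
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == quote:
                if i + 1 < len(line) and line[i + 1] == quote:
                    buffer.append(quote)  # doubled quote is an escaped quote
                    i += 1
                else:
                    in_quotes = False     # closing quote
            else:
                buffer.append(ch)
        elif ch == quote and not buffer:
            in_quotes = True              # opening quote at the start of a field
            was_quoted = True
        elif ch == separator:
            fields.append(_finish_field(buffer, was_quoted, trim_fields))
            buffer, was_quoted = [], False
        else:
            buffer.append(ch)             # spaces are kept like any other character
        i += 1
    fields.append(_finish_field(buffer, was_quoted, trim_fields))
    return fields
```

With `trim_fields=False` the line `a, b c ," d ",e` yields `['a', ' b c ', ' d ', 'e']`. With `trim_fields=True` only the unquoted fields are stripped, giving `['a', 'b c', ' d ', 'e']`.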

MKleusberg commented 10 years ago

Yes, feel free to do that :smiley: Remembering the settings sounds good as well. I also thought about extending the list in the encoding combobox automatically whenever a custom codec was used - e.g. the user uses "ISO-8859-15", so it's safe to assume that this encoding might be used again. This might make selecting encodings easier without having that endless list of never-used formats in there.

rp- commented 10 years ago

Maybe the whole CSV decoding algorithm needs a rewrite. For example, we don't support newlines in field data (even when quoted). Did this work with the old version? I also don't think sqlitedb.cpp is a good place for this routine.
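
To illustrate why purely line-based parsing cannot handle newlines inside quoted fields, here is a small Python sketch using the standard `csv` module. The sample data is made up for this example and the code is unrelated to the DB4S code base.

```python
import csv
import io

# One header row plus two records; the first record has a line break
# inside a quoted field.
data = 'id,note\r\n1,"first line\nsecond line"\r\n2,plain\r\n'

# Splitting on line breaks first sees four "lines", so the quoted field is torn apart.
naive_lines = data.splitlines()

# A parser that tracks the quote state across line breaks sees three records.
rows = list(csv.reader(io.StringIO(data, newline="")))
```

Here `naive_lines` has four entries while `rows` has three, and the `note` field of the first record still contains its line break.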

Z4us commented 10 years ago

In fact there is: http://tools.ietf.org/html/rfc4180

As is written there, spaces should not be ignored, and the use of double quotes depends on the system/application; M$ExCel goes its own way...

Kind regards / Vriendelijke groeten / Cordiali saluti, Klaas V
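
As a quick check of the "spaces should not be ignored" rule, Python's standard `csv` reader can serve as a point of comparison. This is only a sketch with a made-up sample line; Python's reader is not a formal RFC 4180 reference implementation.

```python
import csv

line = ['a, b ,c']

# Default behaviour: spaces around the separator stay part of the field.
preserved = next(csv.reader(line))

# Opt-in behaviour: only whitespace directly after the separator is skipped.
leading_skipped = next(csv.reader(line, skipinitialspace=True))
```

`preserved` is `['a', ' b ', 'c']`, while `leading_skipped` is `['a', 'b ', 'c']`. Any trimming is something the user asks for, not the default.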

justinclift commented 10 years ago

Cool @Z4us, that's potentially useful. At the very least it gives us a minimum reference spec to follow. And it's only about 4 pages of content (unlike other RFCs I've seen). :smile:

justinclift commented 10 years ago

@deichbilder This should now be fixed in the latest code. Would you be ok to try it, to confirm it works for you now? Not sure which operating system you're on... if it's Windows, there are nightly builds available here:

    http://rp.oldsch00l.com/sqlitebrowser/sqlitebrowser.exe.xz

.xz is a type of compression. It can be uncompressed using 7-zip.
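
If 7-zip is not at hand, an .xz file can also be unpacked with a few lines of Python using the standard `lzma` module. This is a minimal sketch; the function name `decompress_xz` and the file names in the usage note are placeholders.

```python
import lzma
import shutil


def decompress_xz(source_path, target_path):
    """Write the decompressed content of an .xz file to target_path."""
    with lzma.open(source_path, "rb") as compressed, open(target_path, "wb") as target:
        # Copy in chunks so large files do not have to fit in memory.
        shutil.copyfileobj(compressed, target)
```

For example, `decompress_xz("sqlitebrowser.exe.xz", "sqlitebrowser.exe")` would unpack a downloaded build in the current directory.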

deichbilder commented 10 years ago

Great! I will try it ASAP (I'm on holiday for a few days). The OS is Windows. Thanks for fixing it so quickly! I'm sure it will work. I'll give you feedback.

deichbilder commented 10 years ago

I just tested - it works! :-) Thanks a lot and have a nice Sunday!

justinclift commented 10 years ago

Awesome, thanks for confirming. :smile: