Closed wby238 closed 7 years ago
Thank you for your report.
Could you possibly provide me the problematic JSON data? I've tried to reproduce the problem with UTF-8 (without BOM) JSON file, but I couldn't (sqlitebiter completed without errors).
I'm sorry that now I can't use my pc. That json has many chinese words, you can write a simple json to test it, such as:
{
"sites": [
{ "name":"菜鸟教程" , "url":"m.runoob.com" },
{ "name":"google" , "url":"www.google.com" },
{ "name":"微博" , "url":"www.weibo.com" }
]
}
The reason may be sqlitebiter read the json as gbk codec, although that json is utf8.
The following method can get the original json:
Thank you for your reply.
debug info:
[DEBUG] sqlitebiter file: converting 'test.json'
[DEBUG] pytablereader: TableFileLoaderFactory.create_from_path: extension=json
[DEBUG] pytablereader: loading file: format=json, path=test.json, encoding=utf-8
[ERROR] sqlitebiter file: invalid json data format: path=test.json, message='gbk' codec can't decode byte 0xa6 in position 336: illegal multibyte sequence
[DEBUG] sqlitebiter file: ----- sqlitebiter completed -----
[DEBUG] sqlitebiter file: database path: test.sqlite
[DEBUG] sqlitebiter file: number of created table: 0
[DEBUG] sqlitebiter file:
[DEBUG] sqlitebiter file: ----- database schema -----
[DEBUG] SimpleSQLie: connect to a SQLite database: path='test.sqlite', mode=r
[DEBUG] SimpleSQLie: close connection to a SQLite database: path='G:\test.sqlite'
[DEBUG] sqlitebiter file:
[DEBUG] SimpleSQLie: close connection to a SQLite database: path='G:\test.sqlite'
original json: test.zip
Thank you for your additional information. I could reproduce the problem.
It looks like there are two problems, encoding and JSON format:
sqlitebiter
not properly opening for utf8 input files some formats (include JSON files).
This cause an encoding error at Windows environment.
Original JSON file (test.json
) that you mentioned in the previous comment is not convertible as it is by sqlitebiter
.
Convertible JSON formats for sqlitebiter
are very limited for now (details are described in here).
I've fix a dependency package that loading files.
Please try upgrade the sqlitebiter
dependency packages with pip install sqlitebiter --upgrade
.
You could convert utf8 files properly.
>type utf.json
{
"sites": [
{ "name":"闖憺ク滓蕗遞・ , "url":"m.runoob.com" },
{ "name":"google" , "url":"www.google.com" },
{ "name":"蠕ョ蜊・ , "url":"www.weibo.com" }
]
}
>sqlitebiter file utf.json
[INFO] sqlitebiter file: convert 'utf.json' to 'sites' table
>sqlite3 out.sqlite
sqlite> .schema
CREATE TABLE IF NOT EXISTS 'sites' (name TEXT, url TEXT);
You could extract convertible JSON data by jq command from the JSON, and convert it by sqlitebiter
.
If you have a certain database schema that you would like to extract from the JSON file.
Thank you for your reply!
Env:x64 win7, python3.6.2
I used 'sqlitebiter file test.json -o test.sqlite', the test.json is encoded as utf-8 without bom, but I get the following error: