hexenq / kuroshiro

Japanese language library for converting Japanese sentence to Hiragana, Katakana or Romaji with furigana and okurigana modes supported.
https://kuroshiro.org
MIT License
810 stars, 92 forks

Error when calling kuroshiro init [invalid file signature] even though files are found #27

Open Fxxxlei opened 6 years ago

Fxxxlei commented 6 years ago

Hello. When calling kuroshiro.init in the browser, I get this error:

Uncaught Error: invalid file signature:36,5
    at $.g (kuroshiro.js:9052)
    at XMLHttpRequest.xhr.onload (kuroshiro.js:7685)

where 36,5 can be replaced by several different pairs of numbers. This error happens 12 times and resembles the one from #24, but differs in that the files are found, the dicPath is correct, and the file sizes match. The same dictionary files work in Node.js. I installed kuroshiro using bower.

hexenq commented 6 years ago

Thanks for the feedback. Are you using webpack or another module bundler? The error message seems to be thrown by zlib. Usually it is caused by a corrupted file. Though the files work in Node.js, I suggest you re-download them and try again to see if that helps.

Fxxxlei commented 6 years ago

After looking at it again, I too think it has something to do with zlib. I tried redownloading the files, but it didn't help. I guess either something goes amiss during the download of the dictionaries by the browser from my server, or zlib doesn't work correctly in my browser, though both would probably be unrelated to kuroshiro.

I am not using webpack or any other bundler. I installed kuroshiro with bower and tried including it directly in the HTML page with a script tag, as well as importing it as an ES6 module in a script of my own. Both approaches gave the same result.

I'll try playing around with the zlib library itself to see how it behaves in my environment.

hexenq commented 6 years ago

Also check the browser you are using, which may be causing the error.

toanlcgift commented 4 years ago

Some servers automatically decompress .gz files in the response (e.g. Python Django), so just modify kuromoji/src/loader/BrowserDictionaryLoader.js to skip the gunzip step and build your own js:

BrowserDictionaryLoader.prototype.loadArrayBuffer = function (url, callback) {
    var xhr = new XMLHttpRequest();
    xhr.open("GET", url, true);
    xhr.responseType = "arraybuffer";
    xhr.onload = function () {
        if (this.status > 0 && this.status !== 200) {
            callback(xhr.statusText, null);
            return;
        }
        var arraybuffer = this.response;

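        // The server has already decompressed the response, so the original
        // gunzip step is disabled and the raw buffer is passed through.
        //var gz = new zlib.Zlib.Gunzip(new Uint8Array(arraybuffer));
        //var typed_array = gz.decompress();
        callback(null, arraybuffer);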
    };
    xhr.onerror = function (err) {
        callback(err, null);
    };
    xhr.send();
};
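
Disabling gunzip entirely breaks the loader again if the server ever starts sending the files still compressed. A more tolerant variant checks the gzip magic bytes first. Gzip data always starts with the bytes 31, 139 (0x1f, 0x8b), and the pair of numbers in the "invalid file signature" message (such as 36,5) appears to be the first two bytes that zlib actually read. The sketch below is an assumption-laden illustration, not code from kuromoji: `isGzipped` and `toDictionaryBuffer` are hypothetical helper names, and `gunzip` stands for whatever decompressor you supply.

```javascript
// Gzip streams always begin with the two magic bytes 0x1f 0x8b (31, 139).
function isGzipped(arrayBuffer) {
    var bytes = new Uint8Array(arrayBuffer);
    return bytes.length >= 2 && bytes[0] === 0x1f && bytes[1] === 0x8b;
}

// Hypothetical helper: returns an ArrayBuffer holding the decompressed data.
// `gunzip` is a caller-supplied function (Uint8Array -> Uint8Array); it is
// only called when the response still carries the gzip signature.
function toDictionaryBuffer(arrayBuffer, gunzip) {
    if (!isGzipped(arrayBuffer)) {
        // The server (or browser) already decompressed the response.
        return arrayBuffer;
    }
    var out = gunzip(new Uint8Array(arrayBuffer));
    // Copy only the bytes that belong to the view, in case it is a
    // window into a larger underlying buffer.
    return out.buffer.slice(out.byteOffset, out.byteOffset + out.byteLength);
}
```

Inside xhr.onload this could be called as toDictionaryBuffer(this.response, function (bytes) { return new zlib.Zlib.Gunzip(bytes).decompress(); }), reusing the Gunzip call from the commented-out lines. A decompressed file could in principle start with 31, 139 by coincidence, so treat the check as a heuristic.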

chinenvinicius commented 2 years ago

"Also look into the browser you use which may cause the error."

@hexenq I have tested in every browser and I get the same issue. For some strange reason the dict files load on the index page, but when I go to other pages the .dat.gz files don't load at all; I get a 404 Not Found error. This should be fixed. I don't think it has anything to do with my browser.

The error is coming from this line of code:

},{"./TokenizerBuilder":4,"./dict/builder/DictionaryBuilder":14}],16:[function(require,module,exports){
"use strict";var zlib=require("zlibjs/bin/gunzip.min.js"),DictionaryLoader=require("./DictionaryLoader");function BrowserDictionaryLoader(r){DictionaryLoader.apply(this,[r])}BrowserDictionaryLoader.prototype=Object.create(DictionaryLoader.prototype),BrowserDictionaryLoader.prototype.loadArrayBuffer=function(r,e){var o=new XMLHttpRequest;o.open("GET",r,!0),o.responseType="arraybuffer",o.onload=function(){if(this.status>0&&200!==this.status)e(o.statusText,null);else{var r=this.response,t=new zlib.Zlib.Gunzip(new Uint8Array(r)).decompress();e(null,t.buffer)}},o.onerror=function(r){e(r,null)},o.send()},module.exports=BrowserDictionaryLoader;
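
One possible explanation for the 404 on pages other than the index page, offered as an assumption since the thread does not show the dicPath in use: a relative dicPath is resolved against the URL of the current page, so the same setting points to different locations on different pages. The sketch below illustrates this with the standard URL constructor; the site layout, the dicPath values, and the helper name `resolveDictionaryUrl` are all hypothetical.

```javascript
// Hypothetical helper: shows where the browser would look for a dictionary
// file, given a dicPath and the URL of the page making the request.
function resolveDictionaryUrl(dicPath, fileName, pageUrl) {
    return new URL(dicPath + fileName, pageUrl).href;
}

// Relative dicPath: the result depends on the page.
var fromIndex = resolveDictionaryUrl(
    "dict/", "base.dat.gz", "https://example.com/index.html");
// fromIndex === "https://example.com/dict/base.dat.gz"

var fromSubPage = resolveDictionaryUrl(
    "dict/", "base.dat.gz", "https://example.com/pages/about.html");
// fromSubPage === "https://example.com/pages/dict/base.dat.gz"

// Root-relative dicPath: the same URL from every page.
var fromSubPageRooted = resolveDictionaryUrl(
    "/dict/", "base.dat.gz", "https://example.com/pages/about.html");
// fromSubPageRooted === "https://example.com/dict/base.dat.gz"
```

If this is the cause, a dicPath that starts with "/" (or a full URL) should load the files from every page.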