scribeocr / scribe.js

JavaScript OCR and text extraction for images and PDFs.
GNU Affero General Public License v3.0

Unable to use in projects created using Vue CLI #2

Closed WillemJann closed 1 month ago

WillemJann commented 2 months ago

I'm trying to use Scribe.js in an existing Vue 2 application. However, I'm getting Webpack compile errors after including the JS library. Here are the steps to reproduce:

  1. Make sure Vue CLI is installed:
    • npm install -g @vue/cli
  2. Create new project:
    • vue create hello-world
    • Select Default ([Vue 2] babel, eslint)
  3. Install Scribe.js from npm:
    • cd hello-world
    • npm i scribe.js-ocr
  4. Add import statement for Scribe.js
    • Edit src/App.vue and add import scribe from 'scribe.js-ocr' to <script> section
  5. Run application
    • npm run serve

Now the following errors are shown:

ERROR in ./node_modules/scribe.js-ocr/js/containers/fontContainer.js 18:12-28
Module not found: Error: Can't resolve 'module' in 'D:\hello-world\node_modules\scribe.js-ocr\js\containers'
 @ ./node_modules/scribe.js-ocr/scribe.js 6:0-59 86:30-37
 @ ./node_modules/babel-loader/lib/index.js??clonedRuleSet-40.use[0]!./node_modules/@vue/vue-loader-v15/lib/index.js??vue-loader-options!./src/App.vue?vue&type=script&lang=js 2:0-35
 @ ./src/App.vue?vue&type=script&lang=js 1:0-190 1:206-209 1:211-398 1:211-398
 @ ./src/App.vue 2:0-54 3:0-49 3:0-49 10:2-8
 @ ./src/main.js 2:0-28 5:17-20

ERROR in ./node_modules/scribe.js-ocr/js/containers/fontContainer.js 22:12-25
Module not found: Error: Can't resolve 'url' in 'D:\hello-world\node_modules\scribe.js-ocr\js\containers'

BREAKING CHANGE: webpack < 5 used to include polyfills for node.js core modules by default.
This is no longer the case. Verify if you need this module and configure a polyfill for it.

If you want to include a polyfill, you need to:
        - add a fallback 'resolve.fallback: { "url": require.resolve("url/") }'
        - install 'url'
If you don't want to include a polyfill, you can use an empty module like this:
        resolve.fallback: { "url": false }
 @ ./node_modules/scribe.js-ocr/scribe.js 6:0-59 86:30-37
 @ ./node_modules/babel-loader/lib/index.js??clonedRuleSet-40.use[0]!./node_modules/@vue/vue-loader-v15/lib/index.js??vue-loader-options!./src/App.vue?vue&type=script&lang=js 2:0-35
 @ ./src/App.vue?vue&type=script&lang=js 1:0-190 1:206-209 1:211-398 1:211-398
 @ ./src/App.vue 2:0-54 3:0-49 3:0-49 10:2-8
 @ ./src/main.js 2:0-28 5:17-20

ERROR in ./node_modules/scribe.js-ocr/js/containers/fontContainer.js 25:12-26
Module not found: Error: Can't resolve 'path' in 'D:\hello-world\node_modules\scribe.js-ocr\js\containers'

BREAKING CHANGE: webpack < 5 used to include polyfills for node.js core modules by default.
This is no longer the case. Verify if you need this module and configure a polyfill for it.

If you want to include a polyfill, you need to:
        - add a fallback 'resolve.fallback: { "path": require.resolve("path-browserify") }'
        - install 'path-browserify'
If you don't want to include a polyfill, you can use an empty module like this:
        resolve.fallback: { "path": false }
 @ ./node_modules/scribe.js-ocr/scribe.js 6:0-59 86:30-37
 @ ./node_modules/babel-loader/lib/index.js??clonedRuleSet-40.use[0]!./node_modules/@vue/vue-loader-v15/lib/index.js??vue-loader-options!./src/App.vue?vue&type=script&lang=js 2:0-35
 @ ./src/App.vue?vue&type=script&lang=js 1:0-190 1:206-209 1:211-398 1:211-398
 @ ./src/App.vue 2:0-54 3:0-49 3:0-49 10:2-8
 @ ./src/main.js 2:0-28 5:17-20

ERROR in ./node_modules/scribe.js-ocr/js/import/nodeAdapter.js 1:0-20
Module not found: Error: Can't resolve 'fs' in 'D:\hello-world\node_modules\scribe.js-ocr\js\import'
 @ ./node_modules/scribe.js-ocr/js/import/import.js 89:16-42
 @ ./node_modules/scribe.js-ocr/scribe.js 15:0-69 74:8-19 142:2-13 143:2-17
 @ ./node_modules/babel-loader/lib/index.js??clonedRuleSet-40.use[0]!./node_modules/@vue/vue-loader-v15/lib/index.js??vue-loader-options!./src/App.vue?vue&type=script&lang=js 2:0-35
 @ ./src/App.vue?vue&type=script&lang=js 1:0-190 1:206-209 1:211-398 1:211-398
 @ ./src/App.vue 2:0-54 3:0-49 3:0-49 10:2-8
 @ ./src/main.js 2:0-28 5:17-20

ERROR in ./node_modules/scribe.js-ocr/js/import/nodeAdapter.js 2:0-24
Module not found: Error: Can't resolve 'path' in 'D:\hello-world\node_modules\scribe.js-ocr\js\import'

BREAKING CHANGE: webpack < 5 used to include polyfills for node.js core modules by default.
This is no longer the case. Verify if you need this module and configure a polyfill for it.

If you want to include a polyfill, you need to:
        - add a fallback 'resolve.fallback: { "path": require.resolve("path-browserify") }'
        - install 'path-browserify'
If you don't want to include a polyfill, you can use an empty module like this:
        resolve.fallback: { "path": false }
 @ ./node_modules/scribe.js-ocr/js/import/import.js 89:16-42
 @ ./node_modules/scribe.js-ocr/scribe.js 15:0-69 74:8-19 142:2-13 143:2-17
 @ ./node_modules/babel-loader/lib/index.js??clonedRuleSet-40.use[0]!./node_modules/@vue/vue-loader-v15/lib/index.js??vue-loader-options!./src/App.vue?vue&type=script&lang=js 2:0-35
 @ ./src/App.vue?vue&type=script&lang=js 1:0-190 1:206-209 1:211-398 1:211-398
 @ ./src/App.vue 2:0-54 3:0-49 3:0-49 10:2-8
 @ ./src/main.js 2:0-28 5:17-20

ERROR in ./node_modules/scribe.js-ocr/js/utils/miscUtils.js 252:14-26
Module not found: Error: Can't resolve 'fs' in 'D:\hello-world\node_modules\scribe.js-ocr\js\utils'
 @ ./node_modules/scribe.js-ocr/scribe.js 22:0-107 114:45-63 115:44-61 116:52-77
 @ ./node_modules/babel-loader/lib/index.js??clonedRuleSet-40.use[0]!./node_modules/@vue/vue-loader-v15/lib/index.js??vue-loader-options!./src/App.vue?vue&type=script&lang=js 2:0-35
 @ ./src/App.vue?vue&type=script&lang=js 1:0-190 1:206-209 1:211-398 1:211-398
 @ ./src/App.vue 2:0-54 3:0-49 3:0-49 10:2-8
 @ ./src/main.js 2:0-28 5:17-20

ERROR in ./node_modules/scribe.js-ocr/js/worker/compareOCRModule.js 49:14-26
Module not found: Error: Can't resolve 'os' in 'D:\hello-world\node_modules\scribe.js-ocr\js\worker'

BREAKING CHANGE: webpack < 5 used to include polyfills for node.js core modules by default.
This is no longer the case. Verify if you need this module and configure a polyfill for it.

If you want to include a polyfill, you need to:
        - add a fallback 'resolve.fallback: { "os": require.resolve("os-browserify/browser") }'
        - install 'os-browserify'
If you don't want to include a polyfill, you can use an empty module like this:
        resolve.fallback: { "os": false }
 @ ./node_modules/scribe.js-ocr/js/recognizeConvert.js 52:55-93 23:55-93
 @ ./node_modules/scribe.js-ocr/scribe.js 19:0-111 78:123-132 102:43-59 136:2-12 139:2-13 148:2-11 149:2-15
 @ ./node_modules/babel-loader/lib/index.js??clonedRuleSet-40.use[0]!./node_modules/@vue/vue-loader-v15/lib/index.js??vue-loader-options!./src/App.vue?vue&type=script&lang=js 2:0-35
 @ ./src/App.vue?vue&type=script&lang=js 1:0-190 1:206-209 1:211-398 1:211-398
 @ ./src/App.vue 2:0-54 3:0-49 3:0-49 10:2-8
 @ ./src/main.js 2:0-28 5:17-20

ERROR in ./node_modules/scribe.js-ocr/js/worker/compareOCRModule.js 92:12-36
Module not found: Error: Can't resolve 'worker_threads' in 'D:\hello-world\node_modules\scribe.js-ocr\js\worker'
 @ ./node_modules/scribe.js-ocr/js/recognizeConvert.js 52:55-93 23:55-93
 @ ./node_modules/scribe.js-ocr/scribe.js 19:0-111 78:123-132 102:43-59 136:2-12 139:2-13 148:2-11 149:2-15
 @ ./node_modules/babel-loader/lib/index.js??clonedRuleSet-40.use[0]!./node_modules/@vue/vue-loader-v15/lib/index.js??vue-loader-options!./src/App.vue?vue&type=script&lang=js 2:0-35
 @ ./src/App.vue?vue&type=script&lang=js 1:0-190 1:206-209 1:211-398 1:211-398
 @ ./src/App.vue 2:0-54 3:0-49 3:0-49 10:2-8
 @ ./src/main.js 2:0-28 5:17-20

ERROR in ./node_modules/scribe.js-ocr/js/worker/compareOCRModule.js 100:12-24
Module not found: Error: Can't resolve 'fs' in 'D:\hello-world\node_modules\scribe.js-ocr\js\worker'
 @ ./node_modules/scribe.js-ocr/js/recognizeConvert.js 52:55-93 23:55-93
 @ ./node_modules/scribe.js-ocr/scribe.js 19:0-111 78:123-132 102:43-59 136:2-12 139:2-13 148:2-11 149:2-15
 @ ./node_modules/babel-loader/lib/index.js??clonedRuleSet-40.use[0]!./node_modules/@vue/vue-loader-v15/lib/index.js??vue-loader-options!./src/App.vue?vue&type=script&lang=js 2:0-35
 @ ./src/App.vue?vue&type=script&lang=js 1:0-190 1:206-209 1:211-398 1:211-398
 @ ./src/App.vue 2:0-54 3:0-49 3:0-49 10:2-8
 @ ./src/main.js 2:0-28 5:17-20

ERROR in ./node_modules/scribe.js-ocr/js/worker/compareOCRModule.js 103:12-26
Module not found: Error: Can't resolve 'util' in 'D:\hello-world\node_modules\scribe.js-ocr\js\worker'

BREAKING CHANGE: webpack < 5 used to include polyfills for node.js core modules by default.
This is no longer the case. Verify if you need this module and configure a polyfill for it.

If you want to include a polyfill, you need to:
        - add a fallback 'resolve.fallback: { "util": require.resolve("util/") }'
        - install 'util'
If you don't want to include a polyfill, you can use an empty module like this:
        resolve.fallback: { "util": false }
 @ ./node_modules/scribe.js-ocr/js/recognizeConvert.js 52:55-93 23:55-93
 @ ./node_modules/scribe.js-ocr/scribe.js 19:0-111 78:123-132 102:43-59 136:2-12 139:2-13 148:2-11 149:2-15
 @ ./node_modules/babel-loader/lib/index.js??clonedRuleSet-40.use[0]!./node_modules/@vue/vue-loader-v15/lib/index.js??vue-loader-options!./src/App.vue?vue&type=script&lang=js 2:0-35
 @ ./src/App.vue?vue&type=script&lang=js 1:0-190 1:206-209 1:211-398 1:211-398
 @ ./src/App.vue 2:0-54 3:0-49 3:0-49 10:2-8
 @ ./src/main.js 2:0-28 5:17-20

ERROR in ./node_modules/scribe.js-ocr/scrollview-web/scrollview/ScrollView.js 14:12-36
Module not found: Error: Can't resolve 'worker_threads' in 'D:\hello-world\node_modules\scribe.js-ocr\scrollview-web\scrollview'
 @ ./node_modules/scribe.js-ocr/js/recognizeConvert.js 493:18-70
 @ ./node_modules/scribe.js-ocr/scribe.js 19:0-111 78:123-132 102:43-59 136:2-12 139:2-13 148:2-11 149:2-15
 @ ./node_modules/babel-loader/lib/index.js??clonedRuleSet-40.use[0]!./node_modules/@vue/vue-loader-v15/lib/index.js??vue-loader-options!./src/App.vue?vue&type=script&lang=js 2:0-35
 @ ./src/App.vue?vue&type=script&lang=js 1:0-190 1:206-209 1:211-398 1:211-398
 @ ./src/App.vue 2:0-54 3:0-49 3:0-49 10:2-8
 @ ./src/main.js 2:0-28 5:17-20

ERROR in node:fs
Module build failed: UnhandledSchemeError: Reading from "node:fs" is not handled by plugins (Unhandled scheme).
Webpack supports "data:" and "file:" URIs by default.
You may need an additional plugin to handle "node:" URIs.
    at D:\hello-world\node_modules\webpack\lib\NormalModule.js:973:10
    at Hook.eval [as callAsync] (eval at create (D:\hello-world\node_modules\tapable\lib\HookCodeFactory.js:33:10), <anonymous>:6:1)
    at Hook.CALL_ASYNC_DELEGATE [as _callAsync] (D:\hello-world\node_modules\tapable\lib\Hook.js:18:14)
    at Object.processResource (D:\hello-world\node_modules\webpack\lib\NormalModule.js:969:8)
    at processResource (D:\hello-world\node_modules\loader-runner\lib\LoaderRunner.js:220:11)
    at iteratePitchingLoaders (D:\hello-world\node_modules\loader-runner\lib\LoaderRunner.js:171:10)
    at runLoaders (D:\hello-world\node_modules\loader-runner\lib\LoaderRunner.js:398:2)
    at NormalModule._doBuild (D:\hello-world\node_modules\webpack\lib\NormalModule.js:959:3)
    at NormalModule.build (D:\hello-world\node_modules\webpack\lib\NormalModule.js:1144:15)
    at D:\hello-world\node_modules\webpack\lib\Compilation.js:1418:12
    at NormalModule.needBuild (D:\hello-world\node_modules\webpack\lib\NormalModule.js:1466:32)
    at Compilation._buildModule (D:\hello-world\node_modules\webpack\lib\Compilation.js:1399:10)
    at D:\hello-world\node_modules\webpack\lib\util\AsyncQueue.js:324:10
    at Hook.eval [as callAsync] (eval at create (D:\hello-world\node_modules\tapable\lib\HookCodeFactory.js:33:10), <anonymous>:6:1)
    at AsyncQueue._startProcessing (D:\hello-world\node_modules\webpack\lib\util\AsyncQueue.js:314:26)
    at AsyncQueue._ensureProcessing (D:\hello-world\node_modules\webpack\lib\util\AsyncQueue.js:301:12)
    at process.processImmediate (node:internal/timers:476:21)
 @ ./node_modules/scribe.js-ocr/js/fontContainerMain.js 18:12-29
 @ ./node_modules/scribe.js-ocr/scribe.js 13:0-79 50:20-39 138:2-15
 @ ./node_modules/babel-loader/lib/index.js??clonedRuleSet-40.use[0]!./node_modules/@vue/vue-loader-v15/lib/index.js??vue-loader-options!./src/App.vue?vue&type=script&lang=js 2:0-35
 @ ./src/App.vue?vue&type=script&lang=js 1:0-190 1:206-209 1:211-398 1:211-398
 @ ./src/App.vue 2:0-54 3:0-49 3:0-49 10:2-8
 @ ./src/main.js 2:0-28 5:17-20

webpack compiled with 13 errors

Similar errors appear in a blank Vue 3 application created with the preset Default ([Vue 3] babel, eslint). I think the issue is related to Webpack 5 and its configuration, but so far I have been unable to resolve all of the errors. It seems Webpack doesn't recognize that it is building for the browser and that some imports can be skipped.

Do you have any suggestions to get Scribe.js working in a Vue CLI application?

WillemJann commented 2 months ago

I already tried to install node-polyfill-webpack-plugin and configure it in vue.config.js:

var NodePolyfillPlugin = require('node-polyfill-webpack-plugin')

module.exports = {
    // [...] (other options elided)
    chainWebpack: config => {
        // Register the plugin that polyfills Node.js core modules
        config.plugin('polyfills').use(NodePolyfillPlugin)
    }
}

That will resolve some of the errors, but not all of them. In your source code I see checks for browser environments / Node.js environments, so I'm expecting these polyfills aren't required. For some reason Webpack doesn't follow the logic correctly to skip the Node.js related imports.
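For reference, the empty-module fallback that the Webpack messages above suggest (resolve.fallback: { "url": false }) could be sketched in vue.config.js roughly as follows. This is only a sketch under assumptions: the module list is copied from the error output above, the names nodeOnlyModules, fallback, and vueConfig are made up for illustration, and it is not expected to address the node:fs "Unhandled scheme" error, which is not a resolution failure. Whether the library then runs correctly depends on those Node.js code paths never being reached in the browser.

```javascript
// vue.config.js (sketch, not a verified fix)
// Node.js core modules named in the "Can't resolve" errors above.
const nodeOnlyModules = ['fs', 'module', 'os', 'path', 'url', 'util', 'worker_threads'];

// Map every module name to `false`, which tells Webpack 5 to substitute
// an empty module instead of failing to resolve the import.
const fallback = Object.fromEntries(nodeOnlyModules.map((name) => [name, false]));

const vueConfig = {
  configureWebpack: {
    resolve: { fallback },
  },
};

module.exports = vueConfig;
```

Unlike a polyfill plugin, this adds no browser implementations of these modules; it only silences the resolution errors.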

I also tried the following way of importing, but that doesn't work either:

import scribe from 'node_modules/scribe.js-ocr/scribe.js'

Balearica commented 2 months ago

I will respond to this later. However, I just deleted a post from a new account claiming you should run a (presumably) malicious binary file. Obviously you should not do that.

victobui commented 2 months ago

I have another issue with my Vite/React config. When I do the import, I get the type error: Could not find a declaration file for module 'node_modules/scribe.js-ocr/scribe.js'. 'frontend/node_modules/scribe.js-ocr/scribe.js' implicitly has an 'any' type. I am wondering whether, to overcome this, I simply need to create a types folder with a TS config as well. However, I need this to be a browser-run project, not a Node.js one, for privacy reasons. If anyone can help, that would be great.

I would also like to know: when running natively in the browser, do you use only the tesseract.js engine, or do you have a layer on top that also works for PDFs?

Balearica commented 2 months ago

I also would like to know if running native in the browser, do you use only tesseract.js's engine or do you have a layer on top that will work for pdf's as well.

All of the features of Scribe.js work in both the browser and Node.js. Scribe.js can be used to read and write to PDFs. If you have further questions about PDFs, please open a separate discussion, as this is outside of the scope of building with Vue.

Balearica commented 2 months ago

@WillemJann I was able to replicate this issue. It looks like Scribe.js does not currently work with Webpack (which is used by Vue.js v2) due to (1) issues with Webpack not detecting when code is Node.js or browser specific and (2) issues with path traversal. I do want to support Webpack, and do not believe it will be difficult to do so, so I'll try and have a patch version released by Wednesday or Thursday this week.

WillemJann commented 2 months ago

Thank you. Great that you want to support Webpack; I'm looking forward to your patch. As soon as it is released, I will test it in our environment.

Balearica commented 1 month ago

I just released scribe.js v0.2.4, which makes a number of changes to improve support with Webpack. It is now possible to build with Webpack 5 using only a couple of non-default settings. Specifically, you need to explicitly define process as undefined to force Webpack to skip the Node.js code, and define DISABLE_DOCX_XLSX to true to skip the .docx/.xlsx export code, which is still problematic in Webpack.

  plugins: [
    new webpack.DefinePlugin({
      process: JSON.stringify(undefined),
      'DISABLE_DOCX_XLSX': JSON.stringify(true),
    }),
  ]
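To illustrate why defining process as undefined lets a bundler skip Node.js-only branches, here is a minimal sketch. The detectEnvironment function is hypothetical and is not taken from the Scribe.js source; it only shows the general shape of an environment check that behaves differently once process is replaced by undefined. It also shows what the two JSON.stringify calls in the snippet above evaluate to.

```javascript
// Hypothetical environment check (an assumption for illustration,
// NOT the actual Scribe.js implementation).
function detectEnvironment(proc) {
  // Node.js exposes process.versions.node; browsers normally have no `process`.
  const isNode = Boolean(proc && proc.versions && proc.versions.node);
  return isNode ? 'node' : 'browser';
}

// JSON.stringify(undefined) evaluates to the value undefined (not a string),
// while JSON.stringify(true) evaluates to the string 'true'.
const definedProcess = JSON.stringify(undefined);
const definedFlag = JSON.stringify(true);

// With `process` replaced by undefined, the check takes the browser path.
console.log(detectEnvironment(definedProcess)); // prints "browser"

// Run under Node.js with the real `process`, it takes the Node.js path.
console.log(detectEnvironment(process)); // prints "node"
```

Once a check like this is statically known to be false, code guarded by it no longer needs to run in the bundle, which is the effect the settings above appear to rely on.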

An example repo using scribe.js with Webpack 5 is here, and an example repo using a Vue.js v2 application created with the Vue CLI tool is here. I've added these repos to a new section of the readme here. Everybody is encouraged to add new example repos with different frameworks/build tools, particularly if they require non-trivial setup.

Let me know if this update resolved the issue with installing in your project.

WillemJann commented 1 month ago

Yes, I can confirm that with the update (v0.2.4) Scribe.js is working in our Vue 2 project, created with the Vue CLI. Our vue.config.js contains the following custom Webpack settings:

var webpack = require('webpack')

module.exports = {
    configureWebpack: {
        plugins: [
            new webpack.DefinePlugin({
                process: JSON.stringify(undefined),
                DISABLE_DOCX_XLSX: JSON.stringify(true)
            })
        ],
        output: {
            environment: {
                asyncFunction: true
            }
        }
    }
}

The last part (explicit environment support for async function and await) is added to resolve this message:

 warning  in ./node_modules/scribe.js-ocr/js/worker/generalWorker.js

The generated code contains 'async/await' because this module is using "topLevelAwait".
However, your target environment does not appear to support 'async/await'.
As a result, the code may not run as expected or may cause runtime errors.

Thank you for adding support for Webpack! The issue is resolved.