jehna / humanify

Deobfuscate Javascript code using ChatGPT
MIT License

Better error handling/user guidance for missing local models #53

Open 0xdevalias opened 3 months ago

0xdevalias commented 3 months ago

Currently when trying to run with a local model that isn't downloaded, the app crashes with an error such as the following:

⇒ npx humanifyjs local --disableGpu foo.js
(node:96922) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
node:internal/fs/promises:638
  return new FileHandle(await PromisePrototypeThen(
                        ^

Error: ENOENT: no such file or directory, open '/Users/devalias/.humanifyjs/models/Phi-3.1-mini-4k-instruct-Q4_K_M.gguf'
    at async Object.open (node:internal/fs/promises:638:25)
    at async GgufFsFileReader._readByteRange (file:///Users/devalias/dev/foohumanify/node_modules/node-llama-cpp/dist/gguf/fileReaders/GgufFsFileReader.js:49:20)
    at async GgufFsFileReader.<anonymous> (file:///Users/devalias/dev/foohumanify/node_modules/node-llama-cpp/dist/gguf/fileReaders/GgufFsFileReader.js:44:40)
    at async withLock (file:///Users/devalias/dev/foohumanify/node_modules/lifecycle-utils/dist/withLock.js:36:16)
    at async GgufFsFileReader._readToExpandBufferUpToOffset (file:///Users/devalias/dev/foohumanify/node_modules/node-llama-cpp/dist/gguf/fileReaders/GgufFsFileReader.js:41:16)
    at async parseMagicAndVersion (file:///Users/devalias/dev/foohumanify/node_modules/node-llama-cpp/dist/gguf/parser/parseGguf.js:37:27)
    at async parseGguf (file:///Users/devalias/dev/foohumanify/node_modules/node-llama-cpp/dist/gguf/parser/parseGguf.js:11:29)
    at async readSingleFile (file:///Users/devalias/dev/foohumanify/node_modules/node-llama-cpp/dist/gguf/readGgufFileInfo.js:34:16)
    at async readGgufFileInfo (file:///Users/devalias/dev/foohumanify/node_modules/node-llama-cpp/dist/gguf/readGgufFileInfo.js:45:16)
    at async LlamaModel._create (file:///Users/devalias/dev/foohumanify/node_modules/node-llama-cpp/dist/evaluator/LlamaModel/LlamaModel.js:411:26) {
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: '/Users/devalias/.humanifyjs/models/Phi-3.1-mini-4k-instruct-Q4_K_M.gguf'
}

Node.js v22.6.0

It might be useful to give a more user-friendly error that explains how to resolve the issue.
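As a rough idea of what that could look like, here is a minimal sketch that checks for the model file before handing the path to the loader. The helper names (missingModelMessage, assertModelDownloaded) are hypothetical and not part of humanify's current code; the suggested download command is the one from the README.

```typescript
import { existsSync } from "node:fs";

// Hypothetical helper (not in humanify today): builds the message shown
// when the model file is missing.
export function missingModelMessage(alias: string, modelPath: string): string {
  return [
    `Model "${alias}" was not found at ${modelPath}.`,
    `Download it first by running: humanify download ${alias}`,
  ].join("\n");
}

// Hypothetical helper: throws a readable error instead of letting the raw
// ENOENT from the GGUF reader bubble up. The `exists` parameter defaults to
// fs.existsSync and is injectable so the check can be tested without a disk.
export function assertModelDownloaded(
  alias: string,
  modelPath: string,
  exists: (path: string) => boolean = existsSync
): void {
  if (!exists(modelPath)) {
    throw new Error(missingModelMessage(alias, modelPath));
  }
}
```

Calling something like assertModelDownloaded("2b", modelPath) before the model is loaded would replace the stack trace above with a two-line hint.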

I ran into this while trying to test/replicate the following:

0xdevalias commented 3 months ago

I can see that the instructions are in the README here:

Which suggests I need to run humanify download 2b first.

I wonder if it might make more sense to have the local model download as a sub-command of humanify local. That's where I first looked for help on how to download the models, and it didn't even occur to me to check the root-level command, since local things seemed to be 'scoped' under the local command:

⇒ npx humanifyjs local -h
(node:97623) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
Usage: humanify local [options] <input>

Use a local LLM to unminify code

Arguments:
  input                     The input minified Javascript file

Options:
  -m, --model <model>       The model to use (default: "2b")
  -o, --outputDir <output>  The output directory (default: "output")
  -s, --seed <seed>         Seed for the model to get reproduceable results (leave out for random seed)
  --disableGpu              Disable GPU acceleration
  --verbose                 Show verbose output
  -h, --help                display help for command

There also seems to be very minimal information output during the download. It might be nice to know a bit more about which model is being downloaded, from where, where it's being saved, how large it is, etc:

⇒ npx humanifyjs download 2b
(node:97932) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
Downloaded 1.63 GB

I guess it does provide slightly more info when the download is completed:

⇒ npx humanifyjs download 2b
(node:97932) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
                  Model "2b" downloaded to /Users/devalias/.humanifyjs/models/Phi-3.1-mini-4k-instruct-Q4_K_M.gguf
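To illustrate the kind of download output I have in mind, here is a minimal sketch of a status formatter. Everything in it is an assumption for illustration: the DownloadStatus shape, the function names, and the idea that the downloader knows the source URL and the total size up front.

```typescript
// Hypothetical shape of what the downloader would need to report.
interface DownloadStatus {
  alias: string;           // e.g. "2b"
  url: string;             // where the model is fetched from
  destination: string;     // where the file is written
  downloadedBytes: number;
  totalBytes?: number;     // may be unknown if the server sends no length
}

// Formats a byte count using decimal units (1 GB = 1000^3 bytes).
export function formatBytes(bytes: number): string {
  const units = ["B", "KB", "MB", "GB", "TB"];
  let value = bytes;
  let unit = 0;
  while (value >= 1000 && unit < units.length - 1) {
    value /= 1000;
    unit++;
  }
  return `${value.toFixed(unit === 0 ? 0 : 2)} ${units[unit]}`;
}

// Builds a multi-line status: which model, from where, to where, how far.
export function formatDownloadStatus(status: DownloadStatus): string {
  const done = formatBytes(status.downloadedBytes);
  const progress =
    status.totalBytes && status.totalBytes > 0
      ? `${done} of ${formatBytes(status.totalBytes)} (${Math.floor(
          (status.downloadedBytes / status.totalBytes) * 100
        )}%)`
      : done;
  return [
    `Downloading model "${status.alias}"`,
    `  from: ${status.url}`,
    `  to:   ${status.destination}`,
    `  progress: ${progress}`,
  ].join("\n");
}
```

When the total size is unknown, the sketch falls back to printing only the amount downloaded so far, which matches what the command prints today.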

I can see it's downloaded here:

⇒ ls ~/.humanifyjs/models
Phi-3.1-mini-4k-instruct-Q4_K_M.gguf

And the code for that is here:

https://github.com/jehna/humanify/blob/85d17e73d6b760bf896d01dd99fa3a5f98ea2848/src/local-models.ts#L13-L25

I also notice that MODEL_DIRECTORY is hardcoded currently. I wonder if that would be something useful to be able to specify/customize via a CLI arg/env variable/etc.
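A minimal sketch of how that override could be resolved, assuming a CLI value takes precedence over an environment variable, which in turn takes precedence over the current default. The HUMANIFY_MODEL_DIR variable name and the resolveModelDirectory function are hypothetical; the default path is taken from the output above.

```typescript
import { homedir } from "node:os";
import { join, resolve } from "node:path";

// Default mirrors the location seen in the output above (~/.humanifyjs/models).
const DEFAULT_MODEL_DIRECTORY = join(homedir(), ".humanifyjs", "models");

// Hypothetical resolver: CLI argument first, then the (assumed)
// HUMANIFY_MODEL_DIR environment variable, then the default.
export function resolveModelDirectory(
  cliValue?: string,
  env: Record<string, string | undefined> = process.env
): string {
  const override = cliValue || env.HUMANIFY_MODEL_DIR;
  return override ? resolve(override) : DEFAULT_MODEL_DIRECTORY;
}
```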

It seems the humanify local command uses getModelPath:

https://github.com/jehna/humanify/blob/85d17e73d6b760bf896d01dd99fa3a5f98ea2848/src/plugins/local-llm-rename/llama.ts#L19-L22

Which only seems to work for model aliases defined in MODELS:

https://github.com/jehna/humanify/blob/85d17e73d6b760bf896d01dd99fa3a5f98ea2848/src/local-models.ts#L69-L75

Even though the error text for humanify download sounds as though it would be capable of downloading any named model:

https://github.com/jehna/humanify/blob/85d17e73d6b760bf896d01dd99fa3a5f98ea2848/src/local-models.ts#L77-L85

And usually for LLM apps, the --model param would let us specify arbitrary models from Hugging Face or similar.
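One way this could work is for --model to accept either a known alias, a path to a local .gguf file, or a URL. The sketch below only classifies the argument; the ModelSource type and parseModelArgument function are hypothetical, and the list of known aliases would come from whatever MODELS defines.

```typescript
// Hypothetical result type describing where a model should come from.
type ModelSource =
  | { kind: "alias"; alias: string }
  | { kind: "file"; path: string }
  | { kind: "url"; url: string };

// Hypothetical parser for the --model value. Aliases are checked first so
// existing usage (e.g. "2b") keeps working unchanged.
export function parseModelArgument(
  value: string,
  knownAliases: readonly string[]
): ModelSource {
  if (knownAliases.includes(value)) {
    return { kind: "alias", alias: value };
  }
  if (/^https?:\/\//i.test(value)) {
    return { kind: "url", url: value };
  }
  if (value.toLowerCase().endsWith(".gguf")) {
    return { kind: "file", path: value };
  }
  throw new Error(
    `Unknown model "${value}". Expected one of: ${knownAliases.join(", ")}, ` +
      `a path to a .gguf file, or an http(s) URL.`
  );
}
```

With this shape, the error for an unrecognised value also spells out what is actually accepted, rather than implying any named model can be downloaded.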


Edit: Created a new issue related to the download progress/etc: