seanshpark / netron

Visualizer for deep learning and machine learning models
https://www.lutzroeder.com/ai
MIT License

Skip loading weights? #3

Open · seanshpark opened 1 month ago

seanshpark commented 1 month ago

To load LARGE models, like 3B models with 12 GB of weights!

seanshpark commented 1 month ago
diff --git a/source/onnx-proto.js b/source/onnx-proto.js
index b310a795..c1257d9d 100644
--- a/source/onnx-proto.js
+++ b/source/onnx-proto.js
@@ -782,7 +782,7 @@ onnx.TensorProto = class TensorProto {
                     message.raw_data = reader.bytes();
                     break;
                 case 13:
-                    message.external_data.push(onnx.StringStringEntryProto.decode(reader, reader.uint32()));
+                    reader.skip(reader.uint32());
                     break;
                 case 14:
                     message.data_location = reader.int32();
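For context, a minimal sketch of why the skip is cheap (this is not Netron's actual protobuf runtime, just an illustration): external_data is a length-delimited field (field 13, wire type 2), so it starts with a varint byte length, and the payload can be stepped over without decoding any StringStringEntryProto messages.

    // illustration only: a tiny reader with the two calls used in the patch above
    class MiniReader {
        constructor(buffer) {
            this.buffer = buffer; // Uint8Array
            this.position = 0;
        }
        // read an unsigned LEB128 varint (the length prefix of a wire-type-2 field)
        uint32() {
            let value = 0;
            let shift = 0;
            for (;;) {
                const byte = this.buffer[this.position++];
                value |= (byte & 0x7f) << shift;
                if ((byte & 0x80) === 0) {
                    return value >>> 0;
                }
                shift += 7;
            }
        }
        // advance past the payload without allocating or decoding it
        skip(length) {
            this.position += length;
        }
    }
    // so reader.skip(reader.uint32()) jumps over a whole external_data entry in O(1)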
seanshpark commented 4 weeks ago

Netron as a Windows App:

"onnx.ProtoReader.read(): 41388ms"

Netron inside the Edge browser:

"onnx.ProtoReader.read(): 573ms"

um?
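For reference, a minimal way to reproduce that kind of measurement (the log lines above presumably come from Netron's own instrumentation; this is just a generic performance.now() wrapper, not Netron code):

    // hypothetical timing helper
    const timed = async (label, work) => {
        const start = performance.now();
        const result = await work();
        console.log(`${label}: ${Math.round(performance.now() - start)}ms`);
        return result;
    };
    // usage: wrap the call you want to measure,
    // e.g. await timed('onnx.ProtoReader.read()', () => someAsyncRead());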

seanshpark commented 4 weeks ago

What takes so long?

In source/onnx.js:

const promises = keys.map((location) => this.context.fetch(location));
const streams = await Promise.all(promises.map((promise) => promise.then((context) => context.stream).catch(() => null)));

where fetch() is defined in source/view.js:

    async fetch(file) {
        const stream = await this._context.request(file, null, this._base);
        return new view.Context(this, file, stream, new Map());
    }
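To narrow down which part is slow, one option is to time each location fetch individually. A hypothetical debugging sketch (fetchTimed is not part of Netron; it just wraps the same per-location pattern shown above):

    // hypothetical: log how long each external-data location takes to fetch
    const fetchTimed = async (context, location) => {
        const start = performance.now();
        try {
            const result = await context.fetch(location);
            console.log(`${location}: ${Math.round(performance.now() - start)}ms`);
            return result.stream;
        } catch {
            console.log(`${location}: failed after ${Math.round(performance.now() - start)}ms`);
            return null;
        }
    };
    const streams = await Promise.all(keys.map((location) => fetchTimed(this.context, location)));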

Why is the browser fast?

Tested the Netron App with a single .circle model.

conclusion

seanshpark commented 4 weeks ago

Can we force-skip loading external weight files?

diff --git a/source/onnx.js b/source/onnx.js
index a6adac9a..9645ace1 100644
--- a/source/onnx.js
+++ b/source/onnx.js
@@ -1656,12 +1656,6 @@ onnx.ProtoReader = class {
             if (onnx.proto && tensor instanceof onnx.proto.SparseTensorProto) {
                 location(tensor.indices);
                 location(tensor.values);
-            } else if (tensor.data_location === onnx.DataLocation.EXTERNAL && Array.isArray(tensor.external_data)) {
-                for (const entry of tensor.external_data) {
-                    if (entry.key === 'location') {
-                        locations.add(entry.value);
-                    }
-                }
             }
         };
         const model = this.model;

before: about 60 seconds on my Windows 10 PC
after: about 11 seconds, with both App and Browser
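Instead of deleting the block outright, a gentler variant could gate it behind an option so weights can still be loaded when wanted. A hypothetical sketch (skipWeights is not an existing Netron option, and collectExternalLocations is not part of Netron):

    // hypothetical helper: collect external weight locations unless skipping is requested
    const collectExternalLocations = (tensors, skipWeights) => {
        const locations = new Set();
        if (skipWeights) {
            return locations; // nothing to fetch, mirrors the patched behavior above
        }
        for (const tensor of tensors) {
            if (tensor.data_location === onnx.DataLocation.EXTERNAL && Array.isArray(tensor.external_data)) {
                for (const entry of tensor.external_data) {
                    if (entry.key === 'location') {
                        locations.add(entry.value);
                    }
                }
            }
        }
        return locations;
    };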