Loading the model with quantized weights two times corrupts the model

liuliu / s4nnc

Swift for NNC

https://libnnc.org

BSD 3-Clause "New" or "Revised" License


Loading the model with quantized weights two times corrupts the model #18

Closed: ghost closed this issue 11 months ago

ghost commented 11 months ago

To reproduce:

Call the load-weights function two times on the same model, then run the model. You get NaNs. This does not happen with normal FP16/FP32 weights.


graph.openStore(sdxl_model_path) {
  $0.read("unet", model: unet, codec: [.q6p, .q8p, .jit, .ezm7])
}

graph.openStore(sdxl_model_path) {
  $0.read("unet", model: unet, codec: [.q6p, .q8p, .jit, .ezm7])
}

ghost commented 11 months ago

What could be the problem, and is there a solution? Thanks.

liuliu commented 11 months ago

Probably because normal weights are allocated on the nnc side and we just read the blob in, whereas jit weights are allocated on the s4nnc side: https://github.com/liuliu/s4nnc/blob/main/nnc/Store.swift#L2053

A workaround is to create a new model whenever you need to load the weights. Beyond that, we need to look into why this behavior (possibly memory corruption) happens and how to fix it.
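A minimal sketch of that workaround, for illustration only: rather than reading into the same model object a second time, build a fresh model and read the weights into that one. `FreshModelLoader` is a hypothetical helper and not part of s4nnc; `makeUNet()` in the usage comment is likewise a hypothetical stand-in for however the UNet is constructed. The only s4nnc calls shown are the ones from the report above.

```swift
// Hypothetical helper (not part of s4nnc). It never reads weights into the
// same model instance twice: every reload builds a new model first.
final class FreshModelLoader<Model> {
  private let makeModel: () -> Model
  private let loadWeights: (Model) -> Void
  private(set) var model: Model

  init(makeModel: @escaping () -> Model, loadWeights: @escaping (Model) -> Void) {
    self.makeModel = makeModel
    self.loadWeights = loadWeights
    let fresh = makeModel()
    loadWeights(fresh)
    model = fresh
  }

  /// Replaces the current model with a newly built one and loads weights into it.
  func reload() {
    let fresh = makeModel()
    loadWeights(fresh)
    model = fresh
  }
}

// With s4nnc this might be used as follows (assumption: makeUNet() builds the model):
//
// let loader = FreshModelLoader(
//   makeModel: { makeUNet() },
//   loadWeights: { unet in
//     graph.openStore(sdxl_model_path) {
//       $0.read("unet", model: unet, codec: [.q6p, .q8p, .jit, .ezm7])
//     }
//   })
// loader.reload()  // new model, weights read exactly once into it
```

The cost is rebuilding the model on every reload, which trades some setup time for never touching already-loaded jit weights.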

ghost commented 11 months ago

Okay thanks

liuliu commented 11 months ago

The limited case is fixed in https://github.com/liuliu/s4nnc/commit/53f737c5d9b979b7f1ca9c864444ac882709b553

The fix is limited because if a weight of the same name is quantized differently between loads (for example, once in q6p and another time in q8p), it will still produce NaNs.

ghost commented 10 months ago

Thanks