Open manishghop opened 9 months ago
Using the Python API, the process remains stuck, but when using the command line (`iree-compile ./chatglm2-6b-int4.mlir -o ./chatglm.vmfb`) we get an error instead.
Can you attach the MLIR file to the issue? We need to decouple such issues from upstream projects as much as possible; I'm not able to look at this while the reproducer is at the framework level.
The errors are all about the stream dialect, so this does not seem like a codegen issue. We need some help triaging this. @ScottTodd, could you or others take a look?
I don't see any errors, just the module IR after dumping. We need the full console output (and reproducers).
@manishghop a couple of things that can help...
1) You seem to have the `.mlir` file, and that is working from the command line? If you can reproduce from the command line, can you include the `.mlir` file and the command line needed to invoke and reproduce the issue?
2) I am not as familiar with the Python flow; if it really is a difference between invoking from Python vs. the command line, then that will provide a clue that this is something else.
1) Sharing the `.mlir` file is not possible right now, as the Linux system I was working on seems to be undergoing an OS upgrade and is inaccessible.
2) I do have the `.mlir` file. For the conversion of `.mlir` to `.vmfb`, when I tried to use the Python API it didn't seem to progress further, so I had to kill the process. I then tried the `iree-compile` command to check the output, and it printed a large error in the terminal, of which I shared only as far back as I could scroll. This suggests to me that something is off in the bytecode conversion.
> I do have the `.mlir` file. For the conversion of `.mlir` to `.vmfb`, when I tried to use the Python API it didn't seem to progress further, so I had to kill the process. I then tried the `iree-compile` command to check the output, and it printed a large error in the terminal, of which I shared only as far back as I could scroll. This suggests to me that something is off in the bytecode conversion.
The conversion from `.mlir` to `.vmfb` is a full compilation, not just a format conversion. If you share:
a) the MLIR file,
b) the compile command, and
c) the log of the failure (you can just redirect the output to a file and share that),
we can help more.
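As a minimal sketch of the log capture suggested above (the `echo` lines are placeholders standing in for a real `iree-compile` invocation, and `compile_log.txt` is an assumed file name), merging stderr into stdout lets `tee` save the full failure log while still printing it to the terminal:

```shell
# Placeholder commands stand in for iree-compile; 2>&1 merges stderr into
# stdout so tee captures the complete failure log in compile_log.txt.
{ echo "compiling..."; echo "error: example failure" 1>&2; } 2>&1 | tee compile_log.txt
cat compile_log.txt
```

The same `2>&1 | tee compile_log.txt` suffix can be appended to the actual `iree-compile` command line.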
@manishghop @ScottTodd
With all the pre-IREE errors I fixed in https://github.com/llvm/torch-mlir/issues/2730 applied, I just ran export_chatglm2.py via `python export_chatglm2.py`. It takes a while, but in the end it saved the mlir and vmfb successfully. Here is the cmd output I just got:
```
[DEBUG] Compiling torchscript graph
[DEBUG] Lowering Torch -> Linalg
[DEBUG] Successfully Generated mlir on device
[DEBUG] converting to bytecode
Saved falcon mlir at chatglm-6b-int4.mlir
Compiling for device : cpu-task
Configuring for device:cpu-task
Target triple found:x86_64-linux-gnu
Saved vmfb in ./chatglm.vmfb.
Saved vic vmfb at ./chatglm.vmfb
```
You mentioned "Using python API, the process remains stuck", but I guess you just need to wait a little longer or use a more powerful machine?

And for the command `iree-compile ./chatglm2-6b-int4.mlir -o ./chatglm.vmfb` you used, I think you probably need more flags to replicate what the Python call `shark_module.save_module` did. It should look something like this:
```shell
iree-compile chatglm-6b-int4.mlir \
  --iree-input-type=tm_tensor \
  --iree-vm-bytecode-module-output-format=flatbuffer-binary \
  --iree-hal-target-backends=llvm-cpu \
  --iree-llvmcpu-target-cpu-features=host \
  --iree-llvmcpu-target-triple=x86_64-linux-gnu \
  --iree-llvmcpu-enable-ukernels \
  --iree-llvmcpu-stack-allocation-limit=256000 \
  --iree-global-opt-enable-quantized-matmul-reassociation \
  --iree-stream-resource-max-allocation-size=4294967295 \
  --iree-vm-bytecode-module-strip-source-map=true \
  --iree-util-zero-fill-elided-attrs \
  -o /tmp/chatglm.vmfb
```
The only error I saw is when I ran run_chatglm.py with the generated chatglm.vmfb:
```
(shark.venv) ➜ chatglm python run_chatglm.py
/home/chi/src/SHARK/shark.venv/lib/python3.11/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
Loading module chatglm.vmfb...
[DEBUG] setting iree runtime flags for cpu:
--task_topology_max_group_count=30
--task_topology_max_group_count=30
[DEBUG] setting iree runtime flags for cpu:
--task_topology_max_group_count=30
Successfully Loaded vmfb model
Traceback (most recent call last):
  File "/home/chi/src/test/chatglm/run_chatglm.py", line 109, in <module>
    first_output = shark_module.forward(inputs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chi/src/SHARK/shark/shark_inference.py", line 159, in forward
    return self.shark_runner.run(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chi/src/SHARK/shark/shark_runner.py", line 115, in run
    return get_results(
           ^^^^^^^^^^^^
  File "/home/chi/src/SHARK/shark/iree_utils/compile_utils.py", line 651, in get_results
    result = compiled_vm[function_name](*device_inputs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chi/src/iree-build/runtime/bindings/python/iree/runtime/function.py", line 137, in __call__
    self._invoke(arg_list, ret_list)
  File "/home/chi/src/iree-build/runtime/bindings/python/iree/runtime/function.py", line 162, in _invoke
    self._vm_context.invoke(self._vm_function, arg_list, ret_list)
ValueError: Error invoking function: c/runtime/src/iree/modules/hal/utils/buffer_diagnostics.c:225: INVALID_ARGUMENT; input0 shape dimension 1 mismatch; expected 20 but have 9; expected shape `1x20`, actual shape `1x9`; while invoking native function hal.buffer_view.assert; while calling import;
[ 1] native hal.buffer_view.assert:0 -
[ 0] bytecode module@0:3570 -
```
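The mismatch above (expected `1x20`, got `1x9`) is just the static input shape baked in at export time. A minimal sketch of the padding workaround, using a hypothetical `pad_to_static_shape` helper and an assumed pad id of 0 (the real ChatGLM2 tokenizer's pad token may differ):

```python
# Hypothetical helper: the module was compiled for a fixed input shape of
# (1, 20), so shorter token sequences must be right-padded before forward().
def pad_to_static_shape(token_ids, static_len, pad_id=0):
    """Right-pad a list of token ids to the fixed length the vmfb expects."""
    if len(token_ids) > static_len:
        raise ValueError(
            f"prompt has {len(token_ids)} tokens but the module "
            f"was compiled for a fixed length of {static_len}"
        )
    return token_ids + [pad_id] * (static_len - len(token_ids))

# A 9-token prompt (shape 1x9) padded up to the compiled length of 20.
padded = pad_to_static_shape(list(range(1, 10)), 20)
print(len(padded))  # 20
```

Note this only hides the symptom for short prompts; padding cannot make the module accept sequences longer than the compiled shape.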
Then I tried to run it with a different input size, and everything looks good:
```shell
iree-run-module \
  --device=local-task \
  --module="chatglm.vmfb" \
  --function=forward \
  --input="1x20xi64=1"
```

```
EXEC @forward
result[0]: hal.buffer_view
1x20x65024xf16=[[-4.12891 -4.14062 7.33984 -3.08398 -4.09766 -4.10547 -4.11328 -4.125 -4.10547 -4.11328 -4.12109 -4.12891 -0.742188 11.8906 -4.10547 -4.13672 ... (long tensor dump truncated) ...]]
```
I'm trying to get up to speed here. A few things stand out:

1. Please attach the `.mlir`/`.mlirbc` file, the flags used to compile, and any runtime code (C/Python/tools/etc.) used to load and run the compiled program. The current bug report issue template has a deliberate section for "Additional context" - please use that.
2. Run with `--mlir-print-skip-regions --mlir-print-ir-before-all` (or similar) to get a semi-reasonable amount of logs showing what the compiler is working on and whether it is stalled somewhere.

Regarding version info, I'm trying to reproduce on my Windows dev machine now, and a few deps were missing. Please provide specific reproduction instructions for issues if you can.
```
λ python -m venv .venv
λ .venv\Scripts\activate.bat
(.venv) λ python -m pip install shark-turbine
(.venv) λ python -m pip show shark_turbine | grep Version
Version: 0.9.3
(.venv) λ python -m pip show torch | grep Version
Version: 2.1.2
(.venv) λ python .\export_chatglm2.py > export_chatglm2_output.txt
Traceback (most recent call last):
  File "D:\dev\scratch\iree_2024_01_18\export_chatglm2.py", line 1, in <module>
    from transformers import AutoTokenizer, AutoModel, AutoModelForCausalLM
ModuleNotFoundError: No module named 'transformers'
(.venv) λ python -m pip install transformers
(.venv) λ python -m pip show transformers | grep Version
Version: 4.36.2
(.venv) λ python .\export_chatglm2.py > export_chatglm2_output.txt
Traceback (most recent call last):
  File "D:\dev\scratch\iree_2024_01_18\export_chatglm2.py", line 3, in <module>
    import torch_mlir
ModuleNotFoundError: No module named 'torch_mlir'
```
Does SHARK-Turbine include torch_mlir somehow? Should I even try installing torch_mlir on its own, or could that lead to conflicts?
Also, instructions on https://github.com/llvm/torch-mlir look to be outdated, since Windows works too?
> At the time of writing, we release pre-built snapshot of torch-mlir for Python 3.11 on Linux and macOS.
(edit: sent https://github.com/llvm/torch-mlir/pull/2771 to tweak those instructions)
```
(.venv) λ python -m pip install torch-mlir -f https://llvm.github.io/torch-mlir/package-index/
(.venv) λ python -m pip show torch_mlir | grep Version
Version: 20240118.1087
(.venv) λ python .\export_chatglm2.py > export_chatglm2_output.txt
D:\dev\scratch\iree_2024_01_18\.venv\Lib\site-packages\transformers\utils\generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
Traceback (most recent call last):
  File "D:\dev\scratch\iree_2024_01_18\export_chatglm2.py", line 12, in <module>
    from shark.shark_downloader import download_public_file
ModuleNotFoundError: No module named 'shark'
```
Looks like I need shark from https://github.com/nod-ai/SHARK too?
@ScottTodd you need the shark.venv from https://github.com/nod-ai/SHARK, not the SHARK-Turbine env. SHARK-Turbine uses torch-mlir as a subproject in IREE, so it doesn't need the torch-mlir Python package installed. But export_chatglm2.py was developed on SHARK, which was designed to use the torch-mlir and IREE Python packages separately.

To initialize the shark.venv on Ubuntu, use:
```shell
git clone https://github.com/nod-ai/SHARK
cd SHARK
PYTHON=python3.11 ./setup_venv.sh
source ./SHARK/shark.venv/bin/activate
```
Still trying to repro, and version information is still needed. SHARK's setup instructions are confusing... there are multiple paths through them and the venv setup scripts are not portable across operating systems or directories:

- `setup_venv.sh` fails for several reasons on my Windows machine under bash with https://cmder.app/. I don't have 'python3', just 'python'. My 'python' path has spaces in it (`C:\Program Files\Python311\python.exe`). Venv activation with `source "$VENV_DIR/bin/activate"` does not work on Windows (run `$VENV_DIR\Scripts\activate.bat` instead).
- `setup_venv.ps1` looks like it needs to run in the source directory, but I'd rather set up a virtual environment in an external temp dir...

Probably worth asking/tracking that on the SHARK project.
Got a venv setup and tried running the script...
```
D:\dev\projects\SHARK\shark.venv\Lib\site-packages\torch\utils\_pytree.py:255: UserWarning: <class 'torch.Size'> is already registered as pytree node. Overwriting the previous registration.
  warnings.warn(
Traceback (most recent call last):
  File "D:\dev\scratch\iree_2024_01_19\export_chatglm2.py", line 121, in <module>
    ts_graph = import_with_fx(
               ^^^^^^^^^^^^^^^
  File "D:\dev\projects\SHARK\shark\shark_importer.py", line 697, in import_with_fx
    from brevitas_examples.llm.llm_quant.sharded_mlir_group_export import (
  File "D:\dev\projects\SHARK\shark.venv\Lib\site-packages\brevitas_examples\llm\llm_quant\sharded_mlir_group_export.py", line 58, in <module>
    from brevitas_examples.llm.llm_quant.mlir_custom_mm import brevitas_matmul_rhs_group_quant_library
  File "D:\dev\projects\SHARK\shark.venv\Lib\site-packages\brevitas_examples\llm\llm_quant\mlir_custom_mm.py", line 12, in <module>
    from torch_mlir.dialects.torch.importer.jit_ir.build_tools.registry import \
ModuleNotFoundError: No module named 'torch_mlir.dialects.torch.importer'
```
Full logs here: https://gist.github.com/ScottTodd/9c4c62170ea7f1be5088def18cf553ea
Can someone who ran this extract and share the .mlir, or at least provide specific repro instructions? Ideally repro instructions would use published Python packages... these local builds are really unstable and difficult to pin down across systems.
Fixed here https://github.com/llvm/torch-mlir/issues/2730#issuecomment-1896442202
I'm not sure I'd call a local patch to a .venv folder a "fix"... is there a stable version somewhere I could use? (Also, again - please share the .mlir and/or more specific repro instructions, this is a ton of back and forth for basic issue reporting and triage).
If you have the shark.venv set up and activated already, you can extract the version information with:
```shell
pip list | grep iree-
printf "shark SHA: %s\n" "$(git log --pretty=format:'%H' -n 1)"
printf "iree-compile SHA: %s\n" "$(python -c "import iree.compiler.version as v; print(v.REVISIONS['IREE'])")"
printf "iree-runtime SHA: %s\n" "$(python -c "import iree.runtime.version as v; print(v.REVISIONS['IREE'])")"
```
What happened?
Model: https://huggingface.co/THUDM/chatglm2-6b
Issue: error while applying padding to the input tokens.
During inference, the output token is appended to input_ids for the next forward pass, so the length of input_ids increases by 1 on each subsequent pass. To work around this, we tried padding with several values of max_length (1, 5, 10, 15, 20). The execution flow breaks while converting .mlir to .vmfb.
The code to compile the PyTorch model to .mlir & .vmfb: export_chatglm2.py
This is what we are trying to replicate (without using SHARK): chatglm.py. Using it as a reference, we are trying to replicate the same flow with nod.ai SHARK.
This is what we aim to do using SHARK: run_chatglm.py. Ideally we want the vmfb model to handle differently sized prompts rather than being bound to a fixed shape, which is currently not possible.
Reason: during inference, while generating new tokens, we append the tokens predicted in the first forward pass to input_ids for the subsequent passes. The initial code didn't account for dynamic changes to the shape of input_ids; it expects input_ids to keep the shape that was passed at torch-mlir compilation time. For example, for "What is the capital of Canada?" the shape is (1, 9). During inference it expects input_ids to always be (1, 9), but from the second forward pass onwards the shape of input_ids grows by 1 each step until the stopping criterion is reached, hence it raises an error.
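The growth described above can be shown with a toy sketch (the `fake_forward` stand-in is purely illustrative; the real compiled model returns logits, not a token):

```python
# Each decode step appends the predicted token to input_ids, so a module
# compiled for a static shape like (1, 9) only accepts the first step.
def fake_forward(input_ids):
    # Stand-in for the compiled model: pretend the "next token" is last + 1.
    return input_ids[-1] + 1

input_ids = [1, 2, 3]          # e.g. shape (1, 3) after tokenization
shapes = []
for _ in range(4):             # four decode steps
    shapes.append(len(input_ids))
    input_ids.append(fake_forward(input_ids))

print(shapes)  # [3, 4, 5, 6]
```

A module exported with a fixed input shape would reject every step after the first, which is exactly the `expected 20 but have 9` class of error seen at runtime.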
Using the Python API, the process remains stuck, but when using the command line `iree-compile ./chatglm2-6b-int4.mlir -o ./chatglm.vmfb`, the error we get is:
Steps to reproduce your issue
1) Run (export_chatglm2.py) to compile pytorch model to .mlir & .vmfb.
What component(s) does this issue relate to?
Compiler, Runtime