google-ai-edge / ai-edge-torch

Supporting PyTorch models with the Google AI Edge TFLite runtime.
Apache License 2.0

Llama3 Conversion to TFLite version gets prematurely killed when running the convert_to_tflite.py script #300

Open Arya-Hari opened 1 month ago

Arya-Hari commented 1 month ago

Description of the bug:

I installed the library and all the requirements to try out converting the Llama 3 1B model to the TFLite format. However, whenever I run the convert_to_tflite.py script, the process gets killed. Any idea why? This is what gets printed on the console. Btw, I'm working on WSL.

2024-10-17 14:06:29.116876: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-10-17 14:06:29.978986: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
/home/venv/lib/python3.11/site-packages/torch/_subclasses/functional_tensor.py:362: UserWarning: At pre-dispatch tracing, we will assume that any custom op that is marked with CompositeImplicitAutograd and functional are safe to not decompose. We found xla.mark_tensor.default to be one such op.
  warnings.warn(
/home/venv/lib/python3.11/site-packages/torch/_subclasses/functional_tensor.py:362: UserWarning: At pre-dispatch tracing, we will assume that any custom op that is marked with CompositeImplicitAutograd and functional are safe to not decompose. We found xla.mark_tensor.default to be one such op.
  warnings.warn(
WARNING:root:PJRT is now the default runtime. For more information, see https://github.com/pytorch/xla/blob/master/docs/pjrt.md
WARNING:root:Defaulting to PJRT_DEVICE=CPU
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1729174062.687510 372 cpu_client.cc:467] TfrtCpuClient created.
Killed
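[Editor's note: a bare `Killed` with no Python traceback usually means the Linux kernel's OOM killer terminated the process (this can be confirmed with `dmesg | grep -i "out of memory"` inside WSL). On WSL2 this can happen well below the Windows machine's total RAM, because the WSL2 VM is capped by default (roughly half of host RAM on recent builds; exact defaults vary by WSL version). A hedged sketch of raising the cap via `%UserProfile%\.wslconfig` on the Windows side, followed by `wsl --shutdown`; the specific values are illustrative, not recommendations:]

```
# %UserProfile%\.wslconfig  (Windows side; restart WSL afterwards)
[wsl2]
memory=14GB   # cap for the WSL2 VM; leave headroom for Windows itself
swap=16GB     # extra swap may let the conversion finish, albeit slowly
```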

Actual vs expected behavior:

Expected behavior: it should ideally run without issues and create a .tflite file after conversion.
Actual behavior: the conversion process gets prematurely killed.

Any other information you'd like to share?

No response

haozha111 commented 1 month ago

The converter may need RAM equal to about 3x the size of the model weights on your machine. What's your machine's CPU RAM size?
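[Editor's note: the 3x figure above can be turned into a quick back-of-the-envelope estimate. This is a sketch, not the converter's actual memory accounting; the ~1.24e9 parameter count for the 1B model and the fp32 weight size are assumptions made here for illustration:]

```python
def conversion_ram_gib(n_params, bytes_per_param=4, multiplier=3):
    """Rough GiB of RAM the conversion may need: weights x a working-set
    multiplier (the ~3x figure mentioned in this thread)."""
    return n_params * bytes_per_param * multiplier / 2**30

# A "1B" model with ~1.24e9 fp32 parameters (approximate, assumed figure):
estimate = conversion_ram_gib(1.24e9)
print(f"~{estimate:.1f} GiB")  # close to the full RAM of a 16 GB machine,
                               # and above WSL2's default memory cap on one
```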

Arya-Hari commented 1 month ago

Hello. My machine has 16 GB of installed RAM.

haozha111 commented 1 month ago

I see. The current conversion might require a machine with 32 GB of RAM; we are working on improvements.

Arya-Hari commented 1 month ago

Okay I see. Are there any cloud-based alternatives that I can use to run the scripts?

haozha111 commented 1 month ago

Yes, can you try Colab Pro? Or, if you have a remote cloud instance with sufficient memory, that will work too.
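[Editor's note: before launching the conversion on a Colab or cloud instance, it can help to confirm the machine actually has enough RAM. A minimal sketch that parses Linux's `/proc/meminfo` (where `MemTotal` is reported in kB); the sample string below is illustrative:]

```python
def mem_total_gib(meminfo_text):
    """Parse MemTotal (in kB) out of /proc/meminfo text and return GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) / 2**20  # kB -> GiB
    raise ValueError("MemTotal not found in /proc/meminfo text")

# On an actual Linux box or Colab cell:
#   with open("/proc/meminfo") as f:
#       print(mem_total_gib(f.read()))
sample = "MemTotal:       32849980 kB\nMemFree:  1024 kB\n"
print(f"{mem_total_gib(sample):.1f} GiB")  # sample corresponds to a ~31 GiB machine
```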