google-ai-edge / ai-edge-torch

Supporting PyTorch models with the Google AI Edge TFLite runtime.
Apache License 2.0

Llama3 Conversion to TFLite version gets prematurely killed when running the convert_to_tflite.py script #300

Open Arya-Hari opened 1 month ago

Arya-Hari commented 1 month ago

Description of the bug:

I installed the library and all the requirements to try out converting the Llama 3 1B model to the TFLite format. However, whenever I run the convert_to_tflite.py script, the process gets killed. Any idea why? This is what gets printed on the console. Btw, I'm working on WSL.

2024-10-17 14:06:29.116876: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-10-17 14:06:29.978986: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
/home/venv/lib/python3.11/site-packages/torch/_subclasses/functional_tensor.py:362: UserWarning: At pre-dispatch tracing, we will assume that any custom op that is marked with CompositeImplicitAutograd and functional are safe to not decompose. We found xla.mark_tensor.default to be one such op.
  warnings.warn(
/home/venv/lib/python3.11/site-packages/torch/_subclasses/functional_tensor.py:362: UserWarning: At pre-dispatch tracing, we will assume that any custom op that is marked with CompositeImplicitAutograd and functional are safe to not decompose. We found xla.mark_tensor.default to be one such op.
  warnings.warn(
WARNING:root:PJRT is now the default runtime. For more information, see https://github.com/pytorch/xla/blob/master/docs/pjrt.md
WARNING:root:Defaulting to PJRT_DEVICE=CPU
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1729174062.687510 372 cpu_client.cc:467] TfrtCpuClient created.
Killed
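[Editor's note: a bare `Killed` with no Python traceback usually means the Linux kernel's OOM killer terminated the process (this can be confirmed with `dmesg | grep -i "out of memory"` inside WSL). On WSL2 this can happen well below the Windows machine's total RAM, because the WSL2 VM is capped by default (roughly half of host RAM on recent builds; exact defaults vary by WSL version). A hedged sketch of raising the cap via `%UserProfile%\.wslconfig` on the Windows side, followed by `wsl --shutdown`; the specific values are illustrative, not recommendations:]

```
# %UserProfile%\.wslconfig  (Windows side; restart WSL afterwards)
[wsl2]
memory=14GB   # cap for the WSL2 VM; leave headroom for Windows itself
swap=16GB     # extra swap may let the conversion finish, albeit slowly
```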

Actual vs expected behavior:

Expected behavior: it should ideally run without issues and create a .tflite file after conversion.
Actual behavior: the conversion process gets prematurely killed.

Any other information you'd like to share?

No response

haozha111 commented 1 month ago

The converter may need RAM equal to about 3x the size of the model weights on your machine. What's your machine's CPU RAM size?
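[Editor's note: the 3x figure above can be turned into a quick back-of-the-envelope estimate. This is a sketch, not the converter's actual memory accounting; the ~1.24e9 parameter count for the 1B model and the fp32 weight size are assumptions made here for illustration:]

```python
def conversion_ram_gib(n_params, bytes_per_param=4, multiplier=3):
    """Rough GiB of RAM the conversion may need: weights x a working-set
    multiplier (the ~3x figure mentioned in this thread)."""
    return n_params * bytes_per_param * multiplier / 2**30

# A "1B" model with ~1.24e9 fp32 parameters (approximate, assumed figure):
estimate = conversion_ram_gib(1.24e9)
print(f"~{estimate:.1f} GiB")  # close to the full RAM of a 16 GB machine,
                               # and above WSL2's default memory cap on one
```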

Arya-Hari commented 1 month ago

Hello. My machine has 16 GB of installed RAM.

haozha111 commented 1 month ago

I see. The current conversion might require a machine with 32 GB of RAM; we are working on improvements.

Arya-Hari commented 1 month ago

Okay I see. Are there any cloud-based alternatives that I can use to run the scripts?

haozha111 commented 1 month ago

Yes, can you try Colab Pro? Or, if you have a remote cloud instance with sufficient memory, that will work too.
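[Editor's note: before launching the conversion on a Colab or cloud instance, it can help to confirm the machine actually has enough RAM. A minimal sketch that parses Linux's `/proc/meminfo` (where `MemTotal` is reported in kB); the sample string below is illustrative:]

```python
def mem_total_gib(meminfo_text):
    """Parse MemTotal (in kB) out of /proc/meminfo text and return GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) / 2**20  # kB -> GiB
    raise ValueError("MemTotal not found in /proc/meminfo text")

# On an actual Linux box or Colab cell:
#   with open("/proc/meminfo") as f:
#       print(mem_total_gib(f.read()))
sample = "MemTotal:       32849980 kB\nMemFree:  1024 kB\n"
print(f"{mem_total_gib(sample):.1f} GiB")  # sample corresponds to a ~31 GiB machine
```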