ml-explore / mlx

MLX: An array framework for Apple silicon
https://ml-explore.github.io/mlx/
MIT License
14.83k stars, 845 forks

feat: metal formatting and pre-commit bump #1038

Closed NripeshN closed 2 weeks ago

NripeshN commented 3 weeks ago

Add metal formatting functionality

awni commented 3 weeks ago

Nice! I'm glad pre-commit finally supports .metal, that's awesome!

Could you fix the conflicts and some of the badly formatted kernel instantiation sections? Then we can merge it in.

NripeshN commented 3 weeks ago

Hey @awni, any idea why the test is failing? I can't test anything locally until https://github.com/ml-explore/mlx/issues/1045 is fixed :(

awni commented 3 weeks ago

Looks like that kernel isn't being instantiated. You probably deleted the line that instantiates it, or something like that.

[metal::Device] Unable to load kernel vselectfloat32

awni commented 3 weeks ago

Could you share the command you are running? Also what OS / machine?

NripeshN commented 3 weeks ago

> Could you share the command you are running? Also what OS / machine?

LOW_MEMORY=1 DEVICE=gpu METAL_DEVICE_WRAPPER_TYPE=1 METAL_DEBUG_ERROR_MODE=0 python -m xmlrunner discover -v python/tests -o test-results/gpu

Copied it from the CI 😅

Using macOS 14.3 on an M2 Max MacBook Pro

awni commented 3 weeks ago

The command should be fine. I don't know why it's not working for you.

For some reason Metal isn't JIT compiling for you. It would be useful to know whether a fresh build on main works.

NripeshN commented 3 weeks ago

> The command should be fine. I don't know why it's not working for you.
>
> For some reason Metal isn't JIT compiling for you. It would be useful to know if you try a fresh build on main if it works or not.

By fresh build, do you mean deleting the venv and creating a new one? I have tried that multiple times. Adding the line PATCH_COMMAND /usr/bin/patch -N -i ${METAL_CPP_PATCH} || true fixed the installation, but I still run into this error when running the unit tests.

In the meantime, can you please run CI on the latest commit?

awni commented 3 weeks ago

> By fresh build you mean you want me to delete the venv and create a new one right?

I meant try a fresh git clone and build. I don't think it's related to your virtual environment.

NripeshN commented 3 weeks ago

> > By fresh build you mean you want me to delete the venv and create a new one right?
>
> I meant try a fresh git clone and build. I don't think it's related to your virtual environment.

Just did that, still facing the same issue.

awni commented 3 weeks ago

How are you building the library? Do any other tests work other than the compile ones?
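
One way to answer the second question quickly is to run only the compile tests instead of the whole suite. The sketch below is an assumption, not part of the repository: `compile_test_command` is a hypothetical helper that reuses the environment variables from the CI command quoted above and swaps `xmlrunner` for the standard library's `unittest discover` with a filename pattern.

```python
import os
import sys


def compile_test_command(tests_dir="python/tests", pattern="test_compile.py"):
    """Build (env, argv) for running only the compile tests on the GPU.

    Hypothetical helper: the environment variables are copied from the CI
    command in this thread; the test directory and file pattern are assumed
    from the test names shown in the logs.
    """
    env = dict(os.environ)
    env.update(
        {
            "LOW_MEMORY": "1",
            "DEVICE": "gpu",
            "METAL_DEVICE_WRAPPER_TYPE": "1",
            "METAL_DEBUG_ERROR_MODE": "0",
        }
    )
    # `unittest discover -s DIR -p PATTERN` only collects matching test files.
    argv = [
        sys.executable, "-m", "unittest", "discover",
        "-v", "-s", tests_dir, "-p", pattern,
    ]
    return env, argv
```

From the repository root, `env, argv = compile_test_command()` followed by `subprocess.run(argv, env=env)` would run just that file; passing a different `pattern` (for example `test_array.py`) checks whether the non-compile tests behave differently.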

NripeshN commented 3 weeks ago

> How are you building the library? Do any other tests work other than the compile ones?

I am building it using this command:

env CMAKE_BUILD_PARALLEL_LEVEL="" pip install -e .

Yes, several other tests pass:

(venv) nripeshniketan@Nripeshs-MBP ~/D/p/mlx (main)> LOW_MEMORY=1 DEVICE=gpu METAL_DEVICE_WRAPPER_TYPE=1 METAL_DEBUG_ERROR_MODE=0 python -m xmlrunner discover -v python/tests -o test-results/gpu
2024-04-28 05:24:55.431 Python[41150:214436] Metal API Validation Enabled

Running tests...
----------------------------------------------------------------------
  test_api (test_array.TestArray) ... ok (0.034s)
  test_array_at (test_array.TestArray) ... ok (0.014s)
  test_array_basics (test_array.TestArray) ... ok (0.002s)
  test_array_comparison (test_array.TestArray) ... ok (0.004s)
  test_array_copy (test_array.TestArray) ... ok (0.030s)
  test_array_iteration (test_array.TestArray) ... ok (0.001s)
  test_array_neg (test_array.TestArray) ... ok (0.001s)
  test_array_np_conversion (test_array.TestArray) ... ok (0.001s)
  test_array_np_dtype_conversion (test_array.TestArray) ... ok (0.006s)
  test_array_np_shape_dim_check (test_array.TestArray) ... ok (0.000s)
  test_array_pickle (test_array.TestArray) ... ok (0.006s)
  test_array_repr (test_array.TestArray) ... ok (0.001s)
  test_array_to_list (test_array.TestArray) ... ok (0.001s)
  test_array_type_cast (test_array.TestArray) ... ok (0.001s)
  test_array_view_ref_counting (test_array.TestArray) ... ok (0.000s)
  test_bool_conversion (test_array.TestArray) ... ok (0.000s)
  test_buffer_protocol (test_array.TestArray) ... ok (0.017s)
  test_buffer_protocol_ref_counting (test_array.TestArray) ... ok (0.000s)
  test_buffer_protocol_tf (test_array.TestArray) ... ok (0.019s)
  test_construction_from_lists (test_array.TestArray) ... ok (0.002s)
  test_construction_from_lists_of_mlx_arrays (test_array.TestArray) ... ok (0.174s)
  test_dtype_promotion (test_array.TestArray) ... ok (0.002s)
  test_dtype_python_scalar_promotion (test_array.TestArray) ... ok (0.001s)
  test_indexing (test_array.TestArray) ... ok (0.214s)
  test_init_from_array (test_array.TestArray) ... ok (0.001s)
  test_inplace (test_array.TestArray) ... ok (0.006s)
  test_inplace_preserves_ids (test_array.TestArray) ... ok (0.000s)
  test_load_from_pickled_np (test_array.TestArray) ... ok (0.001s)
  test_logical_overloads (test_array.TestArray) ... ok (0.001s)
  test_memoryless_copy (test_array.TestArray) ... ok (0.001s)
  test_np_array_conversion_copies_by_default (test_array.TestArray) ... ok (0.000s)
  test_setitem (test_array.TestArray) ... ok (0.016s)
  test_slice_negative_step (test_array.TestArray) ... ok (0.004s)
  test_dtypes (test_array.TestDtypes) ... ok (0.001s)
  test_scalar_conversion (test_array.TestDtypes) ... ok (0.001s)
  test_array_eq_array (test_array.TestEquality) ... ok (0.001s)
  test_array_eq_scalar (test_array.TestEquality) ... ok (0.002s)
  test_list_equals_array (test_array.TestEquality) ... ok (0.000s)
  test_tuple_equals_array (test_array.TestEquality) ... ok (0.000s)
  test_array_ne_array (test_array.TestInequality) ... ok (0.001s)
  test_array_ne_scalar (test_array.TestInequality) ... ok (0.001s)
  test_invalid_op_on_array (test_array.TestInequality) ... ok (0.001s)
  test_list_not_equals_array (test_array.TestInequality) ... ok (0.000s)
  test_obj_inequality_array (test_array.TestInequality) ... ok (0.001s)
  test_tuple_not_equals_array (test_array.TestInequality) ... ok (0.000s)
  test_version (test_array.TestVersion) ... ok (0.000s)
  test_auxiliary_values (test_autograd.TestAutograd) ... ok (0.003s)
  test_captured (test_autograd.TestAutograd) ... ok (0.001s)
  test_eval_in_grad (test_autograd.TestAutograd) ... ok (0.001s)
  test_grad (test_autograd.TestAutograd) ... ok (0.003s)
  test_grad_kwargs (test_autograd.TestAutograd) ... ok (0.006s)
  test_grad_trees (test_autograd.TestAutograd) ... ok (0.007s)
  test_jvp (test_autograd.TestAutograd) ... ok (0.003s)
  test_power_grad (test_autograd.TestAutograd) ... ok (0.001s)
  test_scatter_max_vjp (test_autograd.TestAutograd) ... ok (0.006s)
  test_scatter_min_vjp (test_autograd.TestAutograd) ... ok (0.005s)
  test_scatter_vjp (test_autograd.TestAutograd) ... ok (0.003s)
  test_split_against_slice (test_autograd.TestAutograd) ... ok (0.002s)
  test_stop_gradient (test_autograd.TestAutograd) ... ok (0.004s)
  test_update_state (test_autograd.TestAutograd) ... ok (0.001s)
  test_vjp (test_autograd.TestAutograd) ... ok (0.002s)
  test_vjp_types (test_autograd.TestAutograd) ... ok (0.000s)
  test_arg_reduction_ops (test_bf16.TestBF16) ... ok (0.006s)
  test_binary_ops (test_bf16.TestBF16) ... ok (0.028s)
  test_blas_ops (test_bf16.TestBF16) ... ok (0.034s)
  test_reduction_ops (test_bf16.TestBF16) ... ok (0.054s)
  test_unary_ops (test_bf16.TestBF16) ... ok (0.022s)
  test_addmm (test_blas.TestBlas) ... ok (0.063s)
  test_addmm_grad (test_blas.TestBlas) ... ok (0.019s)
  test_block_masked_matmul (test_blas.TestBlas) ... ok (0.071s)
  test_empty_matmul (test_blas.TestBlas) ... ok (0.001s)
  test_matmul (test_blas.TestBlas) ... ok (0.004s)
  test_matmul_batched (test_blas.TestBlas) ... ok (0.042s)
  test_matmul_dtypes (test_blas.TestBlas) ... ok (0.004s)
  test_matmul_shapes (test_blas.TestBlas) ... ok (2.921s)
  test_matmul_unaligned (test_blas.TestBlas) ... ok (0.165s)
  test_matrix_vector (test_blas.TestBlas) ... ok (3.517s)
  test_matrix_vector_attn (test_blas.TestBlas) ... ok (0.081s)
  test_matrix_vector_batched (test_blas.TestBlas) ... ok (0.048s)
  test_matrix_vector_broadcast (test_blas.TestBlas) ... ok (0.007s)
  test_matrix_vector_edgecases (test_blas.TestBlas) ... ok (0.054s)
  test_compile_broadcast_only (test_compile.TestCompile) ... libc++abi: terminating due to uncaught exception of type std::runtime_error: [metal::Device] Unable to load build metal library from source
program_source:13:17: error: use of undeclared identifier 'Add'
  float tmp_B = Add()(tmp_C, tmp_C);
                ^
program_source:27:17: error: use of undeclared identifier 'Add'
  float tmp_B = Add()(tmp_C, tmp_C);
                ^
program_source:41:17: error: use of undeclared identifier 'Add'
  float tmp_B = Add()(tmp_C, tmp_C);
                ^
program_source:55:17: error: use of undeclared identifier 'Add'
  float tmp_B = Add()(tmp_C, tmp_C);
                ^
program_source:69:17: error: use of undeclared identifier 'Add'
  float tmp_B = Add()(tmp_C, tmp_C);
                ^
program_source:83:17: error: use of undeclared identifier 'Add'
  float tmp_B = Add()(tmp_C, tmp_C);
                ^
program_source:97:17: error: use of undeclared identifier 'Add'
  float tmp_B = Add()(tmp_C, tmp_C);
                ^
program_source:111:17: error: use of undeclared identifier 'Add'
  float tmp_B = Add()(tmp_C, tmp_C);
                ^
program_source:126:17: error: use of undeclared identifier 'Add'
  float tmp_B = Add()(tmp_C, tmp_C);
                ^

fish: Job 1, 'LOW_MEMORY=1 DEVICE=gpu METAL_D…' terminated by signal SIGABRT (Abort)
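
The log above repeats one diagnostic nine times, once per generated kernel variant, so it can help to collapse it before reading further. A minimal sketch, assuming the `program_source:LINE:COL: error: MESSAGE` layout seen in this output; `summarize_metal_errors` is a hypothetical helper and the sample text is a shortened copy of the log:

```python
import re
from collections import Counter

# Matches compiler diagnostics of the form seen in the log above.
ERROR_RE = re.compile(r"^program_source:(\d+):(\d+): error: (.+)$")


def summarize_metal_errors(log_text):
    """Count distinct error messages in a Metal JIT compile log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ERROR_RE.match(line.strip())
        if match:
            counts[match.group(3)] += 1
    return counts


sample_log = """\
program_source:13:17: error: use of undeclared identifier 'Add'
  float tmp_B = Add()(tmp_C, tmp_C);
                ^
program_source:27:17: error: use of undeclared identifier 'Add'
  float tmp_B = Add()(tmp_C, tmp_C);
                ^
"""

for message, count in summarize_metal_errors(sample_log).items():
    print(f"{count}x {message}")
```

Here every error is the same undeclared `Add` identifier, which suggests the generated source is missing whatever defines the op functors rather than having nine separate problems.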