$ python3 to_hf_weights.py --input-ckpt gs://danielk-files/gpt-j-checkpoints_slim/step_318500 --config configs/6B_roto_256.json --output-path gs://danielk-files/danielk-files/gpt-j-checkpoints_slim_hf/step_318500 --dtype fp32
to_hf_weights.py:101: UserWarning: WARNING: Dtype support other than fp16 is Experimental. Make sure to check weights after conversion to make sure dtype information is retained.
warnings.warn(
WARNING:absl:No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
/Users/danielk/opt/anaconda3/envs/jax_py38/lib/python3.8/site-packages/jax/experimental/maps.py:412: UserWarning: xmap is an experimental feature and probably has bugs!
warn("xmap is an experimental feature and probably has bugs!")
key shape (1, 2)
in shape (1, 2048)
dp 1
mp 1
Total parameters: 6050886880
Reading and transforming layers/shards. This may take a while.
Reading/Transforming Layers: 0%|▋ | 1/287 [00:26<2:04:31, 26.12s/it]to_hf_weights.py:384: UserWarning: The use of `x.T` on tensors of dimension other than 2 to reverse their shape is deprecated and it will throw an error in a future release. Consider `x.mT` to transpose batches of matricesor `x.permute(*torch.arange(x.ndim - 1, -1, -1))` to reverse the dimensions of a tensor. (Triggered internally at /Users/distiller/project/pytorch/aten/src/ATen/native/TensorShape.cpp:2318.)
x = torch.tensor(x.squeeze(0), dtype=torch_dtype).T
Reading/Transforming Layers: 1%|█▉ | 3/287 [00:26<41:19, 8.73s/it]
Traceback (most recent call last):
File "to_hf_weights.py", line 488, in <module>
save_sharded_to_hf_format(input_ckpt, params, output_path, np_dtype, torch_dtype)
File "to_hf_weights.py", line 466, in save_sharded_to_hf_format
save_pytree_as_hf(
File "to_hf_weights.py", line 382, in save_pytree_as_hf
x = unshard_leave(x, leave_name, old_shape, np_dtype=np_dtype)
File "to_hf_weights.py", line 324, in unshard_leave
x = reshard(
File "to_hf_weights.py", line 244, in reshard
out = np.reshape(x, old_shape)
File "<__array_function__ internals>", line 5, in reshape
File "/Users/danielk/opt/anaconda3/envs/jax_py38/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 299, in reshape
return _wrapfunc(a, 'reshape', newshape, order=order)
File "/Users/danielk/opt/anaconda3/envs/jax_py38/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 58, in _wrapfunc
return bound(*args, **kwds)
ValueError: cannot reshape array of size 25804800 into shape (1,4096,50400)
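As a sanity check on the final `ValueError`, the two element counts can be compared directly. This is a minimal sketch that uses only the numbers from the error message. The variable names are illustrative, and the reading that the array holds one of several model-parallel shards rather than the full tensor is an assumption the log does not confirm.

```python
import math

# Numbers copied from the ValueError above.
actual_size = 25804800            # elements in the array passed to np.reshape
target_shape = (1, 4096, 50400)   # shape that reshard() tried to produce

# Number of elements the target shape requires.
target_size = math.prod(target_shape)

# How many times larger the target is than the array that was read.
ratio, remainder = divmod(target_size, actual_size)

# Last dimension the array would have if only that axis were split `ratio` ways
# (assumption: the split is along the 50400 axis).
per_piece_last_dim = target_shape[-1] // ratio

print(f"target needs {target_size} elements, array has {actual_size}")
print(f"ratio = {ratio}, remainder = {remainder}")
print(f"array would fit shape (1, {target_shape[1]}, {per_piece_last_dim})")
```

The target shape needs exactly 8 times as many elements as the array contains, and 25804800 = 4096 × 6300 with 6300 = 50400 / 8. That would be consistent with the array being one of 8 pieces of the full tensor, though the log alone does not prove it.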
Here is my environment:
and
OS: macOS Catalina (v10.15.7)