ongiaf opened 9 months ago
The full ONNX model can be downloaded from here:
@ongiaf Do you have any success decoding FuXi (I've been fine-tuning this model for a long time)? I recommend paying attention to this solution
Thanks, it's excellent work.
And with some dirty work, FuXi can successfully run on PyTorch with `Onnx2Torch`. In `Onnx2Torch`, the problems are mainly about `LayerNormalization` and `Clip`.
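For context, those two ONNX ops map onto standard torch primitives, which is where the conversion issues surface. A minimal illustration (not FuXi-specific; the tensor shape is just an example using FuXi's 1536 channel dimension):

```python
import torch

x = torch.randn(2, 4, 1536)

# ONNX LayerNormalization over the last axis corresponds to
# torch.nn.functional.layer_norm with normalized_shape = trailing dims.
y = torch.nn.functional.layer_norm(x, normalized_shape=[1536])

# ONNX Clip with optional bounds corresponds to torch.clamp,
# where None means "unbounded" on that side.
z = torch.clamp(x, min=None, max=1.0)
```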
@ongiaf Did you manage to run FuXi with the current weights for the fine-tuning process? (I am currently thinking about how to finish the work on the model on a 1-hour grid, and thought about freezing all layers except the U-Transformer.)
> Thanks, it's excellent work. And with some dirty work, FuXi can successfully run on PyTorch with `Onnx2Torch`. In `Onnx2Torch`, the problems are mainly about `LayerNormalization` and `Clip`.
Thank you for posting your changes for `Clip`. Could you also suggest how to fix `LayerNormalization`? It looks like the converted model has an issue with the `torch.layer_norm` call.
@juanqiu1 In order for this to work with FuXi, you will need to change the `normalized_shape` parameter in `onnx2torch/node_converters/layer_norm.py` to `[1536]`:
```python
@add_converter(operation_type='LayerNormalization', version=17)
def _(node: OnnxNode, graph: OnnxGraph) -> OperationConverterResult:
    node_attributes = node.attributes
    axis = node_attributes.get('axis', AXIS_DEFAULT_VALUE)
    epsilon = node_attributes.get('epsilon', EPSILON_DEFAULT_VALUE)
    if all(value_name in graph.initializers for value_name in node.input_values[1:]):
        input_value_info = graph.value_info[node.input_values[0]]
        input_shape = get_shape_from_value_info(input_value_info)
        torch_module = nn.LayerNorm(
            normalized_shape=[1536],  # was: input_shape[axis:] (this is the changed line)
            eps=epsilon,
            elementwise_affine=True,
        )
```
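Hardcoding `[1536]` works for this checkpoint, but a shape-agnostic variant would derive `normalized_shape` from the input shape and the node's `axis` attribute. A sketch of that logic (`resolve_normalized_shape` is a hypothetical helper, not part of onnx2torch):

```python
def resolve_normalized_shape(input_shape, axis):
    """Return the trailing dims LayerNorm should normalize over,
    mirroring ONNX LayerNormalization semantics (axis may be negative)."""
    if axis < 0:
        axis += len(input_shape)
    return list(input_shape[axis:])

# FuXi-like activation shape with the ONNX default axis=-1:
shape = resolve_normalized_shape([1, 2, 721, 1440, 1536], -1)
```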
@dsuhoi Thank you for the hint; there are a couple of other easy fixes (typing, etc.).
> Did you manage to run FuXi with the current weights for the fine-tuning process?
Do you have any progress on that? After conversion, I loaded the model into PyTorch, but even on an A100 with FSDP enabled via `accelerate`, I still get a CUDA out-of-memory error.
@juanqiu1 Yes, I managed to start the training process by selecting only the `named_parameters()` belonging to the last dozen UTransformer blocks (this was enough for fine-tuning).
I used an Nvidia A100 (40GB).
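That freezing approach can be sketched as follows; this is a toy stand-in, not FuXi itself, and the `'1.'` name prefix is a hypothetical placeholder for the real UTransformer parameter names:

```python
from torch import nn

# Toy model: freeze every parameter except those in the submodule
# we want to fine-tune (pretend submodule '1' is the last UTransformer blocks).
model = nn.Sequential(
    nn.Linear(8, 8),                 # pretend: encoder, stays frozen
    nn.Sequential(nn.Linear(8, 8)),  # pretend: trainable UTransformer blocks
)
trainable_prefix = '1.'  # hypothetical name prefix for the trainable part
for name, param in model.named_parameters():
    param.requires_grad = name.startswith(trainable_prefix)

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
```

Only the parameters behind the chosen prefix then receive gradients, which cuts optimizer state and activation-gradient memory accordingly.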
Hi, I have an ONNX model. Here is one of the nodes in the ONNX graph:
When I tried to convert it to a torch model, it caused a `KeyError`:
It may be caused by https://github.com/ENOT-AutoDL/onnx2torch/blob/a8b060336c8c95c51a6257a8d99171f0b86b8eab/onnx2torch/node_converters/clip.py#L60
After adding conditions, the conversion works.
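The gist of the needed condition (the actual patch isn't shown above) is to treat absent `min`/`max` inputs of `Clip` as unbounded, which is how `torch.clamp` behaves with `None`. A minimal pure-Python sketch of that semantics:

```python
def clip(x, min_val=None, max_val=None):
    """Scalar sketch of ONNX Clip semantics: a missing bound means
    unbounded on that side, matching torch.clamp(x, min=None, max=None)."""
    if min_val is not None:
        x = max(x, min_val)
    if max_val is not None:
        x = min(x, max_val)
    return x
```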