mlcommons / chakra

Repository for MLCommons Chakra schema and tools
https://mlcommons.org/working-groups/research/chakra/
Apache License 2.0

Can't convert text using et_converter #75

Open lulala-s opened 1 month ago

lulala-s commented 1 month ago

Describe the Bug

When I try to convert Resnet50_DataParallel.txt with the following command:

python3 -m chakra.et_converter.et_converter \
    --input_type Text \
    --input_filename ../../inputs/workload/ASTRA-sim-1.0/Resnet50_DataParallel.txt \
    --output_filename ../../outputs/convert_result/Resnet50_DataParallel \
    --num_npus 4 \
    --num_dims 2 \
    --num_passes 1

the following error occurs:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/chakra/et_converter/et_converter.py", line 106, in main
    converter.convert()
  File "/usr/local/lib/python3.8/dist-packages/chakra/et_converter/text2chakra_converter.py", line 147, in convert
    self.convert_data_parallel(f, num_layers)
  File "/usr/local/lib/python3.8/dist-packages/chakra/et_converter/text2chakra_converter.py", line 202, in convert_data_parallel
    self.add_parent(fwd_comp_node, layers[idx-1].fwd_comp_node)
  File "/usr/local/lib/python3.8/dist-packages/chakra/et_converter/text2chakra_converter.py", line 136, in add_parent
    child_node.parent.append(parent_node.id)
AttributeError: parent

I then searched "text2chakra_converter.py" and "et_def_pb2.pyi" and found that Node has no attribute named parent. The definition is:

class Node(_message.Message):
    __slots__ = ("id", "name", "type", "ctrl_deps", "data_deps", "start_time_micros", "duration_micros", "inputs", "outputs", "attr")
    ID_FIELD_NUMBER: _ClassVar[int]
    NAME_FIELD_NUMBER: _ClassVar[int]
    TYPE_FIELD_NUMBER: _ClassVar[int]
    CTRL_DEPS_FIELD_NUMBER: _ClassVar[int]
    DATA_DEPS_FIELD_NUMBER: _ClassVar[int]
    START_TIME_MICROS_FIELD_NUMBER: _ClassVar[int]
    DURATION_MICROS_FIELD_NUMBER: _ClassVar[int]
    INPUTS_FIELD_NUMBER: _ClassVar[int]
    OUTPUTS_FIELD_NUMBER: _ClassVar[int]
    ATTR_FIELD_NUMBER: _ClassVar[int]
    id: int
    name: str
    type: NodeType
    ctrl_deps: _containers.RepeatedScalarFieldContainer[int]
    data_deps: _containers.RepeatedScalarFieldContainer[int]
    start_time_micros: int
    duration_micros: int
    inputs: IOInfo
    outputs: IOInfo
    attr: _containers.RepeatedCompositeFieldContainer[AttributeProto]
    def __init__(self, id: _Optional[int] = ..., name: _Optional[str] = ..., type: _Optional[_Union[NodeType, str]] = ..., ctrl_deps: _Optional[_Iterable[int]] = ..., data_deps: _Optional[_Iterable[int]] = ..., start_time_micros: _Optional[int] = ..., duration_micros: _Optional[int] = ..., inputs: _Optional[_Union[IOInfo, _Mapping]] = ..., outputs: _Optional[_Union[IOInfo, _Mapping]] = ..., attr: _Optional[_Iterable[_Union[AttributeProto, _Mapping]]] = ...) -> None: ...
TaekyungHeo commented 1 month ago

Please check if the latest PyTorch nightly build works.

lulala-s commented 1 month ago

Do you mean I need to install PyTorch to fix this error? This is a fresh Docker container, and I haven't installed PyTorch.

TaekyungHeo commented 1 month ago

I misread your initial issue, assuming that it was about the PyTorch converter. It must be a bug in the text converter. We have not had a chance to actively support the text converter. The bug in the text converter needs to be fixed. The field name has been changed from parent to data_deps. Please check if it works. You can contribute to Chakra, or you can wait for PRs to fix the error.
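
The rename described above can be sketched as follows. This is a minimal illustration, not the actual patch: the `Node` class here is a hypothetical stand-in for the protobuf-generated message (it keeps only a few of the fields listed in the issue), and `add_parent_old` is a made-up name for the pre-rename behavior shown in the traceback. In the real converter, the change would presumably be made in `add_parent` in text2chakra_converter.py.

```python
# Hypothetical stand-in for the protobuf-generated Node message.
# Assumption: only a subset of the real fields is modeled here.
class Node:
    __slots__ = ("id", "name", "ctrl_deps", "data_deps")

    def __init__(self, id, name=""):
        self.id = id
        self.name = name
        self.ctrl_deps = []
        self.data_deps = []


def add_parent_old(child_node, parent_node):
    # Pre-rename code path: fails because Node has no `parent` field.
    child_node.parent.append(parent_node.id)


def add_parent(child_node, parent_node):
    # Post-rename code path: record the dependency in `data_deps`.
    child_node.data_deps.append(parent_node.id)


parent = Node(id=0, name="fwd_comp_0")
child = Node(id=1, name="fwd_comp_1")

try:
    add_parent_old(child, parent)
except AttributeError as err:
    print("old code fails:", err)

add_parent(child, parent)
print(child.data_deps)
```

The stand-in uses `__slots__` only so that accessing an undefined attribute raises `AttributeError`, mirroring the error in the traceback.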

lulala-s commented 1 month ago

OK. You mean that if I modify the code and it works, I can contribute the fix to that specific version of Chakra, right?

TaekyungHeo commented 1 month ago

Yes, correct.

lulala-s commented 1 month ago

I have already fixed the bug, but when I try to open a pull request, I can't find the specific version "7a5faa8" that astra-sim uses.

lulala-s commented 1 month ago

Should I open a pull request against the opt-bugfix branch?

TaekyungHeo commented 3 weeks ago

You can create a pull request based on the main branch.

lulala-s commented 1 week ago

OK, I will open a pull request based on the main branch.