johnmarktaylor91 / torchlens

Package for extracting and mapping the results of every single tensor operation in a PyTorch model in one line of code.
GNU General Public License v3.0
477 stars 17 forks source link

Something is error when I run torchlens on my pc #23

Closed whisperLiang closed 2 months ago

whisperLiang commented 3 months ago

D:\Application\miniconda\envs\coinfer\lib\site-packages\torch\overrides.py:110: UserWarning: 'has_cuda' is deprecated, please use 'torch.backends.cuda.is_built()'
  torch.has_cuda,
D:\Application\miniconda\envs\coinfer\lib\site-packages\torch\overrides.py:111: UserWarning: 'has_cudnn' is deprecated, please use 'torch.backends.cudnn.is_available()'
  torch.has_cudnn,
D:\Application\miniconda\envs\coinfer\lib\site-packages\torch\overrides.py:117: UserWarning: 'has_mps' is deprecated, please use 'torch.backends.mps.is_built()'
  torch.has_mps,
D:\Application\miniconda\envs\coinfer\lib\site-packages\torch\overrides.py:118: UserWarning: 'has_mkldnn' is deprecated, please use 'torch.backends.mkldnn.is_available()'
  torch.has_mkldnn,
Traceback (most recent call last):
  File "d:\ProgramCode\torchlens\debug_torchlens.py", line 10, in <module>
    model_history = tl.log_forward_pass(model, x, vis_opt='unrolled')
  File "d:\ProgramCode\torchlens\torchlens\user_funcs.py", line 144, in log_forward_pass
    model_history.render_graph(
TypeError: ModelHistory.render_graph() takes from 1 to 8 positional arguments but 14 were given

image

johnmarktaylor91 commented 3 months ago

Thanks for raising this and apologies for the bug. Hmm, it's working for me on the latest pip release. If you pip uninstall torchlens and then reinstall, do you still get the same error?

whisperLiang commented 3 months ago

It works when I install torchlens with pip, but I get the error when I run it from a git clone of the repository.

whisperLiang commented 3 months ago

image

Another question: how can I use a layer to run its computation, or is that already implemented? I cannot find this in the paper or the documentation. Thanks for your reply and help.

johnmarktaylor91 commented 3 months ago

Hmm, are you using the most recent version of the git repo, on the main branch? Currently the render_graph function indeed takes 14 arguments, so I'm not sure why that error message is arising.

image

As for how to execute a layer, the relevant fields are:

layer.func_applied: this is the function executed by the layer
layer.creation_args, layer.creation_kwargs: these are the arguments that were provided to the function for that layer

By using these fields, you should be able to execute the function associated with a layer. Let me know if something doesn't work.

whisperLiang commented 3 months ago

Thank you very much. I deleted the git repo and cloned it again, and now it works without the error. I used layer.func_applied and layer.func_all_args_non_tensor to execute layers successfully. Thanks for your excellent work!

johnmarktaylor91 commented 3 months ago

Awesome, glad it's working well for you :) Closing if no more issues, but feel free to reopen if more problems arise.

whisperLiang commented 2 months ago

Excuse me, how can I get a submodel from a given model_history and a layer index?

johnmarktaylor91 commented 2 months ago

By submodel, do you mean a module inside the model that itself contains multiple operations? You can fetch the layer corresponding to an output of such a module by simply using the "address" of the module. For example, to get the output of the features block in AlexNet you would do model_history['features']. You can easily see the addresses of all the modules using the visualization, or by inspecting the model's code.
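One way to inspect the module names without reading the source is PyTorch's own named_modules(). The sketch below uses a small hypothetical stand-in model (TinyNet is not part of torchvision or TorchLens) and assumes that the TorchLens addresses follow the same dotted names that PyTorch reports, as the 'features' example suggests:

```python
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in with the same top-level layout as AlexNet."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3), nn.ReLU())
        self.classifier = nn.Sequential(nn.Linear(8, 4))


model = TinyNet()

# named_modules() yields (dotted_name, module) pairs; the root module has
# the empty name, so it is filtered out here.
addresses = [name for name, _ in model.named_modules() if name]
print(addresses)
```

Nested modules show up with dotted names such as features.0, which is the form to try when indexing model_history.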

image

whisperLiang commented 2 months ago

No, I want to split the model into sub-models at any given layer index. For example, AlexNet has 22 layers, and I would like to divide it into two sub-models at any valid index, such as [1-14] and [15-22]. Going a step further, I would like to decompose a DNN in DAG form into multiple sub-models. Can the resulting sub-models have the methods and attributes of a model implemented in PyTorch, such as submodel.parameters()?

johnmarktaylor91 commented 2 months ago

If I understand your question, yes you can do this in PyTorch, for example:

submodel = AlexNet.features

Then you can treat this submodule as its own model, including fetching the parameters or passing to TorchLens. Was that what you wanted to know?

johnmarktaylor91 commented 2 months ago

Like this?

import torch
import torchlens as tl
import torchvision

model = torchvision.models.alexnet()
submodule = model.classifier
x = torch.rand(6, 9216)
print(list(submodule.parameters()))
model_history = tl.log_forward_pass(submodule, x, vis_opt='unrolled')
image

whisperLiang commented 2 months ago

Yes, but I hope the split point can be random. For example, the layers in a sub-model may not originally have belonged to one larger module.

johnmarktaylor91 commented 2 months ago

Can you clarify what you mean by random?

whisperLiang commented 2 months ago

For example, suppose I choose layer index 12 in AlexNet as the split point. As you know, it is a ReLU layer in features. I would like layers [1-12] to form one submodel and layers [13-22] to form another. This may be difficult, because layer 13 is in features, layers 16-22 are in classifier, and layers 14-15 do not belong to any original submodule.

whisperLiang commented 2 months ago

random layer split

johnmarktaylor91 commented 2 months ago

I see, so you want to be able to split the model anywhere you want. I am thinking of eventually adding functionality to causally intervene on the computational graph using torchlens, so I’ll keep this suggestion in mind!
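In the meantime, for a purely sequential model, a split at an arbitrary layer index can be sketched in plain PyTorch without TorchLens. This is only a rough illustration under stated assumptions: TinyNet and split_sequential are hypothetical names, the model is a small AlexNet-shaped stand-in, and the functional torch.flatten call (which belongs to no module) is replaced by an nn.Flatten module so that it can sit in the layer list. A model with branches (a general DAG) would need more than this.

```python
import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical AlexNet-shaped model: features -> avgpool -> flatten -> classifier."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.avgpool = nn.AdaptiveAvgPool2d((2, 2))
        self.classifier = nn.Sequential(
            nn.Linear(8 * 2 * 2, 16),
            nn.ReLU(),
            nn.Linear(16, 4),
        )

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)


def split_sequential(model, split_index):
    """Split the flattened layer list into two nn.Sequential sub-models."""
    # One ordered list of leaf modules; nn.Flatten stands in for torch.flatten.
    layers = [*model.features, model.avgpool, nn.Flatten(), *model.classifier]
    head = nn.Sequential(*layers[:split_index])
    tail = nn.Sequential(*layers[split_index:])
    return head, tail


model = TinyNet().eval()
x = torch.rand(2, 3, 8, 8)

# Split inside the features block, right after its ReLU.
head, tail = split_sequential(model, split_index=2)

with torch.no_grad():
    full_out = model(x)
    split_out = tail(head(x))

# The two halves chained together reproduce the original forward pass.
matches = torch.allclose(full_out, split_out)
head_param_count = sum(p.numel() for p in head.parameters())
```

Both halves are ordinary nn.Module objects, so head.parameters() and similar methods work. They reuse the original layer objects rather than copying them, so the weights stay shared with the full model.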

whisperLiang commented 2 months ago

Looking forward to your implementation. By the way, there are some interesting repositories, and I think your repository could serve the same purpose better than they do. For example, torch_pruning ignores the head module when pruning the model, which is not ideal. https://github.com/VainF/Torch-Pruning https://github.com/tianyic/only_train_once I hope this is helpful to you.

whisperLiang commented 2 months ago

One question: when I test layers in resnet18, I find buffer_layers that are related to the BatchNorm layers. However, they appear to be useless for the forward computation, so I need to filter them out. What is the purpose of buffer_layers?

johnmarktaylor91 commented 2 months ago

In PyTorch, “buffers” are tensors that are associated with a model, but that are not trainable parameters. They can be used for any purpose, but in BatchNorm layers they’re used to store the running mean and variance of tensors during the forward pass. TorchLens tracks these for completeness.
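To see which buffers a BatchNorm layer holds, here is a minimal sketch in plain PyTorch (no TorchLens involved; the tensor shapes are arbitrary):

```python
import torch
from torch import nn

bn = nn.BatchNorm2d(3)

# Buffers are registered on the module but are not trainable parameters.
buffer_names = [name for name, _ in bn.named_buffers()]
param_names = [name for name, _ in bn.named_parameters()]
print(buffer_names)
print(param_names)

# In training mode, a forward pass updates the running statistics in place.
bn.train()
mean_before = bn.running_mean.clone()
bn(torch.rand(4, 3, 8, 8))
stats_updated = not torch.equal(mean_before, bn.running_mean)
```

In eval mode BatchNorm normalizes with these stored statistics, so they do affect the forward output even though they are not trained by gradient descent.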

whisperLiang commented 2 months ago

image

When I test layer.func_applied in resnet18, the add operation does not seem to work as expected: y != x1 + x2, but y = x1.

johnmarktaylor91 commented 2 months ago

If you want me to debug, it helps if you paste your code instead of a picture of the code :]

whisperLiang commented 2 months ago

Thanks, here is the code:

import time
import torch
from torch import nn
import torchvision
import torchlens as tl

model = torchvision.models.resnet18(pretrained=True)
o_weights = model.state_dict()

x = torch.rand(1, 3, 224, 224)

model_history = tl.log_forward_pass(model, x, vis_opt='unrolled')
x1 = model_history[15].tensor_contents
x2 = model_history[6].tensor_contents

add = model_history[16]
add_output = add.tensor_contents

parent_params = add.parent_params
args = add.func_all_args_non_tensor

y = add.func_applied(x1, x2, parent_params, args)
print(add)
print(model_history)

johnmarktaylor91 commented 2 months ago

Figured it out. This is because the function executed is an "in-place" add (iadd, for example x += 1), so x1 is updated in-place. This is why y=x1 in this case.
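A minimal sketch of this behavior in plain PyTorch, using operator.iadd as a stand-in for the in-place add (whether TorchLens stores exactly this callable in func_applied is an assumption):

```python
import operator

import torch

x1 = torch.ones(3)
x2 = torch.full((3,), 2.0)
x1_saved = x1.clone()  # copy taken before the in-place op

# operator.iadd(x1, x2) is what `x1 += x2` calls: it writes into x1's storage.
y = operator.iadd(x1, x2)
shares_memory = y.data_ptr() == x1.data_ptr()
y_equals_x1 = torch.equal(y, x1)

# Replaying the op on a clone leaves the saved input untouched.
replayed = operator.iadd(x1_saved.clone(), x2)
same_as_sum = torch.equal(replayed, x1_saved + x2)
```

Because y and x1 share the same memory after the call, comparing them always reports equality. Cloning the first argument before replaying an in-place function keeps the original input available for comparison.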

whisperLiang commented 2 months ago

OK, thanks for your answer. I checked the final result and it is accurate; however, it would be clearer if add_output = x1 + x2.

johnmarktaylor91 commented 2 months ago

Indeed, but the model uses the iadd function, so torchlens will necessarily reflect that.