Closed jiaqima closed 1 week ago
@TheaperDeng, one thing I'm not sure about is whether we want to set the default transform functions for both train and test to the identity transformation, since in many cases only the test rep needs to be transformed.
It happens that I am working on this. I think one possible way is to make `transform_test_rep` an abstract method, so that users always need to implement it, and for `transform_train_rep` we can have a default implementation that returns the train rep directly.

Or, we make both return the rep directly by default, and users can override them when designing their TDA methods.
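For illustration, here's a minimal plain-Python sketch of the two options (the class names `OptionA`/`OptionB` and the identity-default behavior are assumptions for the sketch, not dattri's actual API):

```python
from abc import ABC, abstractmethod


class OptionA(ABC):
    """Option 1: transform_test_rep is abstract, transform_train_rep has a default."""

    def transform_train_rep(self, rep):
        # Default implementation: return the train rep unchanged.
        return rep

    @abstractmethod
    def transform_test_rep(self, rep):
        """Users must always implement this one."""


class OptionB:
    """Option 2: both transforms default to the identity; override as needed."""

    def transform_train_rep(self, rep):
        return rep

    def transform_test_rep(self, rep):
        return rep


class GradDotLike(OptionB):
    # With Option 2, no override is needed at all: the subclass already
    # behaves like a plain inner-product (Grad-Dot style) attributor.
    pass
```

With Option 1, forgetting to implement `transform_test_rep` fails at instantiation time; with Option 2, the base class is usable as-is.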
I think the latter is OK. Users implementing new attributors should be a bit more tolerant than users who just use the attributors, and they don't need an abstract method to remind them this is something they should override.
Yeah, I think so. And this way the inner-product attributor can be directly instantiated and works like Grad-Dot, just as its name suggests.
@tingwl0122 could you please take a look at the new implementation of the Arnoldi attributor and see if there is anything wrong?
I think the overall structure looks good to me. (Moving the projection matrix computation to `cache` has already been listed as a todo.) Another thing is that the original Arnoldi paper actually used only a random batch of the full dataloader to run the algorithm, which I didn't include before. We can do something on the `data_target_pair` before feeding it into `func` to achieve this now (also adding an additional param to control this random batch's size).
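Something like the following sketch, maybe (the helper name and the `batch_size` parameter are placeholders, and plain Python lists stand in for a PyTorch DataLoader):

```python
import random


def sample_random_batch(data_target_pairs, batch_size, seed=None):
    """Draw one random batch of (data, target) pairs from the full set.

    This mimics the Arnoldi paper's use of a single random batch instead
    of the full dataloader; batch_size controls the batch's size.
    """
    rng = random.Random(seed)
    size = min(batch_size, len(data_target_pairs))
    return rng.sample(data_target_pairs, size)


# Toy (data, target) pairs standing in for a dataloader's contents.
dataset = [(x, x % 2) for x in range(100)]
batch = sample_random_batch(dataset, batch_size=16, seed=0)
# `batch` would then be fed into `func` in place of the full data.
```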
Another minor thing is that we will need to force users to call `.cache` before attributing. We can probably raise an error for this.
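A minimal sketch of such a guard, assuming a sentinel for "not yet cached" (the class and method bodies here are illustrative, not the actual dattri implementation):

```python
class CachedAttributor:
    """Attributor that refuses to attribute before .cache() is called."""

    def __init__(self):
        self._cached = None  # sentinel: .cache() has not run yet

    def cache(self, full_train_dataloader):
        # In the refactor, this would also precompute expensive state
        # (e.g. the Arnoldi projection matrix), not just store the loader.
        self._cached = full_train_dataloader

    def attribute(self, test_dataloader):
        if self._cached is None:
            raise RuntimeError(
                "Call .cache(full_train_dataloader) before .attribute()."
            )
        # Placeholder for the real attribution computation.
        return len(self._cached) * len(test_dataloader)
```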
Sounds great. Thanks for the notes. I'll add them later.
Thanks @tingwl0122 @jiaqima. I think I can directly make the changes in my implementation in #133.
Closing this PR in favor of #133. Any follow-up discussion could be done there.
Description
This PR is a reference implementation to refactor the BaseInnerProductAttributor, which is NOT intended to be merged (and has not been tested). As an example, the Arnoldi attributor is re-implemented to demonstrate how the new API of BaseInnerProductAttributor can be used.
A few major changes in BaseInnerProductAttributor:

- Aggregation used to be done in `.attribute()` with a simple average, which lacks enough flexibility; it is now moved into the transform functions.
- `.cache()` could do more than just storing the full training dataloader. For example, the Arnoldi projector used to be processed and cached in `transformation_on_query` in a hacky way; in this PR, this is moved to `.cache()`.
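The second change can be sketched roughly as follows (the class, the toy `_build_projector`, and the list-based reps are all assumptions made for the sketch, not the actual Arnoldi implementation):

```python
class ArnoldiLikeAttributor:
    """Sketch: compute the projector once in .cache(), not lazily per query."""

    def cache(self, train_reps):
        # Store the training reps AND precompute the projector here, once.
        self._train_reps = train_reps
        self._proj = self._build_projector(train_reps)

    @staticmethod
    def _build_projector(reps):
        # Stand-in for the Arnoldi projection matrix computation:
        # here just a scaling factor derived from the cached data.
        return 1.0 / max(len(reps), 1)

    def transform_test_rep(self, rep):
        # The transform only *uses* the cached projector; it no longer
        # builds it on the fly inside the query transformation.
        return [x * self._proj for x in rep]
```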