dallascard / DWAC

Deep Weighted Averaging Classifiers

How to implement in practice #2

Closed: dsvrsec closed this issue 4 years ago

dsvrsec commented 4 years ago

I would like to experiment with DWAC on a use case. Is it only possible to run it using a GPU? Also, can you provide some material regarding the experimentation?

dallascard commented 4 years ago

Yes, there are several example use cases in this repo, in the cifar, mnist, tabular, and text directories. For example, to train a DWAC classifier on the MNIST digits, run `python -m mnist.run --model dwac --device 0` (use `-h` to see more options). If you want to run it on a CPU instead of a GPU, just drop the `--device 0` (the CPU is used by default if a GPU device number is not given).
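
For reference, the invocations described above, run from the repository root:

```bash
# train a DWAC classifier on the MNIST digits using GPU 0
python -m mnist.run --model dwac --device 0

# same, but on the CPU (the default when no --device is given)
python -m mnist.run --model dwac

# list all available options
python -m mnist.run -h
```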

dsvrsec commented 4 years ago

Thank you. And can you please help me with how the results can be interpreted?

dallascard commented 4 years ago

Sure! By default, the output is saved in `data/temp/` (you can change this using the `--output-dir` option). In that directory, there will be one file for each of train, dev, and test. Each of them is an npz file, which you can load using `numpy.load()`. Each of those should have a key (called 'z', I think) that points to the final output representations (n_instances x output_dimensions). Each also contains a list of indices, which gives the order of the instances in the original dataset. The model is also saved, so you could later reload it. However, there is currently no additional code to make predictions from a saved model on new data, so you would need to write that yourself.
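
A minimal sketch of inspecting these outputs, assuming the default output directory and file naming; the key names ('z', 'indices') are guesses to be checked against the actual archive:

```python
import numpy as np

# load the saved test-set outputs (the file name is an assumption)
data = np.load("data/temp/test.npz")

# list the keys the archive actually contains
print(data.files)

# 'z' should point to the final output representations,
# shaped (n_instances, output_dimensions)
z = data["z"]
print(z.shape)

# the archive also stores each instance's index in the original dataset
# ('indices' is a guessed key name; verify against data.files)
indices = data["indices"]
```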

dsvrsec commented 4 years ago

Thank you so much! I was able to do it, but is there any kind of visualization that can be shown as proof of interpretation? I also have a few queries; can you clarify them?

  1. Why is there a mention of the order of instances, given that the weights can be assigned to the corresponding instances (or images)?
  2. What exactly is the difference between conformity and nonconformity?

Thanks.

dallascard commented 4 years ago

No, unfortunately you will need to make your own visualizations depending on your application.

  1. The only reason I mention the order of instances is in case you want to refer back to the original data (e.g., to look at which test point is being embedded where).
  2. I suggest you read the paper for more details about conformal methods. Nonconformity is a metric that captures how "unusual" an instance would be with a particular label; conformity is the opposite. (A toy sketch of one possible nonconformity measure follows this list.)
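
As a rough illustration only (not necessarily the exact measure used in the paper), one simple nonconformity score for a weighted-averaging model is the negative total weight that a candidate label receives from the training instances:

```python
import numpy as np

def nonconformity(weights, train_labels, candidate_label):
    """Toy nonconformity score: the less total weight the candidate label
    receives from the training set, the more 'unusual' that labeling is.
    (Illustrative choice; see the paper for the actual measure.)

    weights:         (n_train,) kernel weights from one test point
                     to each training instance
    train_labels:    (n_train,) integer labels
    candidate_label: the label being scored
    """
    support = weights[train_labels == candidate_label].sum()
    return -support  # higher score = label conforms less

def conformity(weights, train_labels, candidate_label):
    # conformity is simply the opposite of nonconformity
    return -nonconformity(weights, train_labels, candidate_label)
```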
dsvrsec commented 4 years ago

Thank you for your response. I have gone through the paper. Can you please correct my understanding regarding the core concept?

  1. Instead of using a softmax layer to arrive at probability scores, a weighted sum over all training instances is calculated, and after that the nonconformity score is calculated based on a k value instead of all the training instances.
  2. What exactly do you refer to as a low-dimensional space in the model details (Section 3.1)?

dallascard commented 4 years ago

  1. You are correct that you take a weighted sum over the label vectors from all training instances. That could be used directly to make a prediction (i.e., using argmax), but we can also use the validation data to calibrate this prediction. I'm not sure I understand what you're asking here, but ultimately the clearest description I can give is the equations in the paper. (A minimal sketch of this weighted-sum prediction appears after this list.)
  2. I'm not sure what you are asking here either, but I think what you're referring to is that x (high-dimensional) is embedded into a lower-dimensional representation (h).
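
To make the weighted-sum idea in point 1 concrete, here is a minimal sketch assuming a Gaussian kernel over the learned low-dimensional embeddings (variable names and the fixed bandwidth are illustrative, not the repo's actual API):

```python
import numpy as np

def predict(h_test, h_train, y_train, n_classes, gamma=1.0):
    """Predict by a weighted sum of training label vectors.

    h_test:  (d,) embedding of one test instance
    h_train: (n_train, d) embeddings of the training instances
    y_train: (n_train,) integer labels
    gamma:   kernel bandwidth (assumed fixed here for simplicity)
    """
    # Gaussian kernel weights from squared distances in embedding space
    sq_dists = ((h_train - h_test) ** 2).sum(axis=1)
    weights = np.exp(-gamma * sq_dists)          # (n_train,)

    # one-hot label vectors for the training instances
    label_vectors = np.eye(n_classes)[y_train]   # (n_train, n_classes)

    # weighted sum of label vectors; argmax gives the predicted class
    class_scores = weights @ label_vectors       # (n_classes,)
    return class_scores.argmax(), class_scores
```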
dsvrsec commented 4 years ago

Thanks for the response. I have a few more queries after going through the paper again:

  i) I think the weights are calculated based on a Gaussian kernel. Are the weights calculated based on the distances between the current training instance and the remaining instances (like a similarity between images)? Is this done for every image during training? Then the matrix would be sparse, right (as mentioned)? (See the sketch below for how such a weight matrix is formed.)
  ii) How are the instances taken to a lower-dimensional space?
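
As a hedged illustration of question i): with a Gaussian kernel, the weights are indeed a function of pairwise distances between embedded instances. A minimal sketch of forming such a weight matrix (gamma is an assumed bandwidth, and this is not the repo's actual implementation):

```python
import numpy as np

def pairwise_weights(h, gamma=1.0):
    """Gaussian-kernel weights over a set of embedded instances.

    h: (n, d) array of low-dimensional embeddings
    Returns an (n, n) matrix whose (i, j) entry is
    exp(-gamma * ||h_i - h_j||^2). Weights decay rapidly with
    distance, so most entries are near zero for distant pairs.
    (Illustrative only; see the paper's equations for the exact form.)
    """
    sq_dists = ((h[:, None, :] - h[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)
```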