mjack3 opened this issue 2 years ago
Hi @mjack3, I'm really glad to find some help in this project. Thank you very much for your proposal, I accept. This paper is quite obscure. The problem you are addressing is explained in paragraph 4.7:
For ResNet18 and Wide-ResNet50-2, we directly use the features of the last layer in the first three blocks, put these features into the 2D flow model to obtain their respective anomaly detection and localization results, and finally take the average value as the final result.
I think that the paper wants us to build three different models and average their anomaly score. But how do we compute this anomaly score? This is the question that I can't solve. In the introduction we can find that:
We propose a 2D normalizing flow denoted as FastFlow for anomaly detection and localization with fully convolutional networks and two-dimensional loss function to effectively model global and local distribution.
But I can't find how this two-dimensional loss is defined. If you have an idea for a good two-dimensional loss for this problem, I'm all ears. Best, Alessio
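One plausible reading, used by several open reimplementations, is that the "two-dimensional loss" is just the standard flow negative log-likelihood computed on a latent that keeps its (h, w) spatial layout: -log p(x) = 0.5·||z||² - log|det J|, summed over channels and spatial positions. A minimal numpy sketch of that interpretation (the function name and shapes are my own assumptions, not confirmed by the paper; a PyTorch version is analogous):

```python
import numpy as np

def fastflow_nll(z, log_jac_det):
    # z:           (b, c, h, w) latent produced by the 2D flow
    # log_jac_det: (b,) log|det J| accumulated over all flow steps
    # Standard-Gaussian base distribution: the per-sample loss is
    # 0.5 * ||z||^2 - log|det J| (up to an additive constant),
    # then averaged over the batch.
    b = z.shape[0]
    gauss = 0.5 * np.sum(z.reshape(b, -1) ** 2, axis=1)
    return np.mean(gauss - log_jac_det)

z = np.ones((2, 8, 4, 4))            # 8 * 4 * 4 = 128 latent values per sample
print(fastflow_nll(z, np.zeros(2)))  # 64.0 (0.5 * 128 per sample)
```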
Hummm yes, you are right, we definitely need to create 3 FastFlow models. I will try. By the way, you can find my implementation here: https://github.com/mjack3/EasyFastFlow. Feel free to use whatever you want.
Have you tried contacting any of the main authors of the paper? I googled them but didn't find an email.
@mjack3 Hi, have you taken a look at CFLOW-AD? It is also implemented with a flow model; maybe it can help you understand how the 3 FastFlow modules work. I'm trying to implement FastFlow by modifying CFlow-AD. If you need any help or want to discuss, I would be glad to help (if I can).
@Howeng98 you are welcome =)
Yes, I also looked at the CSFLOW-AD code, but I am not sure whether here we need to create 3 individual FastFlow models and train them with 3 optimizers (one per FastFlow model), or do something similar to CSFLOW-AD
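Whichever way the training is organized, §4.7 only says to average the per-block results, so one straightforward reading of the inference side is: run one 2D flow per feature scale, upsample each resulting anomaly map to a common resolution, and take the mean. A numpy sketch of that reading (the `upsample`/`fuse_anomaly_maps` names, the nearest-neighbour resizing, and the map sizes are my own illustration, not from the paper):

```python
import numpy as np

def upsample(amap, size):
    # Nearest-neighbour upsampling of a square (h, h) map to (size, size);
    # assumes size is a multiple of h (illustration only).
    factor = size // amap.shape[0]
    return np.kron(amap, np.ones((factor, factor)))

def fuse_anomaly_maps(maps, size=256):
    # Paper §4.7: obtain one anomaly map per backbone block, then
    # average them to get the final localization result.
    return np.mean([upsample(m, size) for m in maps], axis=0)

# Dummy maps at the spatial resolutions of ResNet18's first three blocks
# on a 256x256 input (64x64, 32x32, 16x16).
maps = [np.random.rand(64, 64), np.random.rand(32, 32), np.random.rand(16, 16)]
fused = fuse_anomaly_maps(maps)
print(fused.shape)  # (256, 256)
```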
@mjack3 I tried to contact Yushuang Wu through a university e-mail I found, but I got no answer. I haven't found the e-mails of the other authors.
When did you contact them, @AlessioGalluccio?
Hi @mjack3, Can you please share your implementation of FastFlow? The link seems to be deactivated. Thanks
Currently I am obliged to keep the code private because of my job contract. I hope to open it soon. Anyway, I will share information in this same thread if needed :)
@AlessioGalluccio just a small remark: for the anomaly score calculation (global and pixel-wise) you need to use p(z), not z, which you are currently using.
You can estimate log p(z) (and therefore p(z)) analogously to the PyTorch implementation of CFlow-AD.
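For what it's worth, here is a small numpy sketch of that idea: per-pixel log-likelihoods of z under a standard-Gaussian base distribution, exponentiated and inverted into an anomaly map roughly the way the CFlow-AD code post-processes its likelihoods. The function name, and the choice to drop the Jacobian term (a common simplification in open reimplementations), are my own assumptions, not something the paper specifies.

```python
import numpy as np

def anomaly_map_from_latent(z):
    # z: (c, h, w) flow output for a single image.
    # Per-pixel log p(z) under N(0, I), summed over channels; dropping
    # the log|det J| term is my own simplification.
    logp = -0.5 * np.sum(z ** 2, axis=0)
    # Exponentiate (shifted for numerical stability) and invert so that
    # high values mean "anomalous", similar to CFlow-AD's post-processing.
    prob = np.exp(logp - logp.max())
    return 1.0 - prob / prob.max()

amap = anomaly_map_from_latent(np.random.randn(16, 8, 8))
print(amap.shape)  # (8, 8)
```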
Hi @maaft, did you manage to achieve results similar to the claimed ones? I tried both the CFlow way and the DifferNet way but am still far below the performance reported in the paper.
Another point that confuses me is that I cannot match the A.d. Param# reported in the paper. I take each FlowStep as one AllInOneBlock from FrEIA, with 2 convolution layers. This is my counting result (paper values in parentheses):
CaiT: 7,043,780 (14.8M)
DeiT: 7,043,780 (14.8M)
Resnet18: 4,650,240 (4.9M)
WideResnet50: 41,309,184 (41.3M) -> this one matches
Here's the code I used to compute the parameter counts:
def count_params_per_flow_step(k, cin, ratio):
    # One coupling subnet: two k x k convolutions mapping
    # cin -> cmed -> cout, where cout = 2 * cin (scale + shift).
    cout = 2 * cin
    cmed = int(cin * ratio)
    w1 = k * k * cin * cmed
    b1 = cmed
    w2 = k * k * cmed * cout
    b2 = cout
    return w1 + w2 + b1 + b2

def count_total_params(num_steps, conv3x3_only, feature_channels, ratio):
    # Sum over every flow step of every per-scale flow model.
    s = 0
    for channels in feature_channels:
        for i in range(num_steps):
            # Alternate 3x3 and 1x1 kernels unless conv3x3_only is set.
            k = 1 if (i % 2 == 1 and not conv3x3_only) else 3
            # Each coupling transforms half of the channels.
            s += count_params_per_flow_step(k, channels // 2, ratio)
    return s

print("CaiT: ", count_total_params(20, False, [768], 0.16))
print("DeiT: ", count_total_params(20, False, [768], 0.16))
print("Resnet18: ", count_total_params(8, True, [64, 128, 256], 1.0))
print("WideResnet50: ", count_total_params(8, False, [256, 512, 1024], 1.0))
@gathierry no, I don't think I can match the scores in the paper (I haven't evaluated yet, only visually). In particular, the transistor class (broken legs) does not learn at all.
I'll evaluate AUROC etc. next week and report back.
Also, I tried backbones other than ResNet18 that achieve higher accuracy on ImageNet (e.g. EfficientNet) and noticed that training has a very hard time converging at all. No idea why this is the case.
My tests of this code (24 epochs) show acceptable results only for ResNet18, and only for three MVTec classes:
|        | AUROC-MAX | AUCPR-MAX |
| ------ | --------- | --------- |
| Bottle | 0.9849    | 0.9955    |
| Screw  | 0.9859    | 0.9959    |
| Wood   | 0.9956    | 0.9987    |
Hello.
I would like to open this issue to talk about this project. I am also interested in developing it, and it would be great to share information, since the paper doesn't give detailed information about the implementation and the official code is not available.
If you agree with this initiative, we could first simplify the project to use Wide-ResNet50 in order to get results comparable with previous research. I would like to start from the beginning of the paper, where it says:
This makes me think that in the implementation we need to use the features after the input layer, layer 1 and layer 2. In this way Table 6 makes sense.
But I cannot imagine how to combine this information so that it is consistent with the next part.
Depending on which part you read, it seems that either just one feature map or three are taken.