AlessioGalluccio / FastFlow

An implementation of the FastFlow architecture (Jiawei Yu et al.)
MIT License
40 stars 13 forks

Q&A #14

Open mjack3 opened 2 years ago

mjack3 commented 2 years ago

Hello.

I would like to open this issue to talk about this project. I am also interested in developing this project, and it would be great to share information, since the paper doesn't give much detail about the implementation and the official code is not available.

If you agree with this initiative, we could first simplify the project to use Wide-ResNet50 in order to get results comparable with previous research. I would like to start from the beginning of the paper, where it says:

For ResNet, we directly use the features of the last layer in the first three blocks, and put these features into three corresponding FastFlow model.

This makes me think that in the implementation we need to use the features after the input layer, layer 1, and layer 2. Read that way, Table 6 makes sense:

[image]

But I cannot see how to concatenate this information so that it is consistent with the following:

In the forward process, it takes the feature map from the backbone network as input image

Depending on which part you read, it seems that either just one feature map or three are used.

maaft commented 2 years ago

If only the authors would publish their f** code. It should be mandatory, lol.

mjack3 commented 2 years ago

I don't get the results of the paper :(

mjack3 commented 2 years ago

The code in my repo is open. I welcome any contributions there.

https://github.com/mjack3/FastFlow-AD

gathierry commented 2 years ago

I managed to achieve comparable performance in a tricky way: I add LayerNorm layers before sending each feature map to the NFs. It is "tricky" because the usage differs between backbones, but it's the only way that works for me.

I opened my code as well: https://github.com/gathierry/FastFlow
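A minimal sketch of that trick, assuming the Wide-ResNet50-2 feature shapes mentioned later in this thread (the variable names here are mine, not the repo's): one `nn.LayerNorm` per scale, normalizing over the full (C, H, W) feature map before it enters the flow.

```python
import torch
import torch.nn as nn

# Hypothetical sketch: a LayerNorm per backbone scale, applied to the
# whole (C, H, W) feature map before the normalizing flow.
feature_shapes = [(256, 64, 64), (512, 32, 32), (1024, 16, 16)]
norms = nn.ModuleList(nn.LayerNorm(s) for s in feature_shapes)

feats = [torch.randn(4, *s) for s in feature_shapes]
normed = [norm(f) for norm, f in zip(norms, feats)]
print([tuple(n.shape) for n in normed])  # shapes unchanged
```

This centers and rescales each sample's feature map, which plausibly explains the stability boost reported below.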

maaft commented 2 years ago

@gathierry wow, LayerNorm really boosts the performance enormously. Thank you very much for that and for opening your code! :)

maaft commented 2 years ago

Does anyone have a clue how we could measure model performance without having labeled ground-truth test data?

mjack3 commented 2 years ago

[image]

Hello guys. Does anyone know what "w/o" and "w" mean in the first column of this table?

Howeng98 commented 2 years ago

@mjack3 w/o = without, w = with

mjack3 commented 2 years ago

@gathierry I'm wondering why you do this in these lines: https://github.com/gathierry/FastFlow/blob/master/fastflow.py#L148-L151

gathierry commented 2 years ago

@mjack3 Why am I taking the exponential? Because the values at different levels should be converted to a probability before being merged. I want to project the output to its probability under a Gaussian distribution. And I guess this also normalizes the values at different levels to the same range; otherwise one level could dominate the result.
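The idea described here could be sketched as follows (a hedged reconstruction, not a copy of the repo's code): turn each flow output z into a per-pixel Gaussian log-likelihood, exponentiate so every level lives on the same [0, 1] probability scale, upsample to input resolution, and average; negating the probability turns low likelihood into a high anomaly score.

```python
import torch
import torch.nn.functional as F

# Sketch: merge per-level flow outputs into one anomaly map.
def anomaly_map(outputs, size=(256, 256)):
    maps = []
    for z in outputs:  # each z: (B, C, H, W)
        log_prob = -0.5 * torch.mean(z ** 2, dim=1, keepdim=True)  # Gaussian log-likelihood (up to a constant)
        prob = torch.exp(log_prob)                                  # comparable [0, 1] scale across levels
        maps.append(F.interpolate(-prob, size=size,
                                  mode="bilinear", align_corners=False))
    return torch.mean(torch.stack(maps, dim=-1), dim=-1)

zs = [torch.randn(2, c, s, s) for c, s in [(256, 64), (512, 32), (1024, 16)]]
print(anomaly_map(zs).shape)  # torch.Size([2, 1, 256, 256])
```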

mjack3 commented 2 years ago

@gathierry My question is more about the -torch.mean and the interpolation of the negative.

gathierry commented 2 years ago

@mjack3 oh about that.

mjack3 commented 2 years ago

@gathierry thanks for your answer.

Why are you using that loss function instead of something like this? [image] [link]

Am I missing something?

gathierry commented 2 years ago

@mjack3 I am using the same loss function https://github.com/gathierry/FastFlow/blob/d275b79d47d6e272115d45fd7fc0f29cca0f5107/fastflow.py#L139

What you mentioned before https://github.com/gathierry/FastFlow/blob/master/fastflow.py#L148-L151 was just for inference.
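For reference, the standard normalizing-flow training objective being discussed here can be sketched like this (my variable names, under the usual standard-Gaussian base distribution assumption): minimize 0.5 * ||z||^2 minus the log-determinant of the flow's Jacobian, averaged over the batch.

```python
import torch

# Sketch of the negative log-likelihood loss for a normalizing flow.
def nll_loss(z, log_det_jac):
    # z: (B, C, H, W) flow output; log_det_jac: (B,) log|det J| per sample
    return torch.mean(0.5 * torch.sum(z ** 2, dim=(1, 2, 3)) - log_det_jac)

z = torch.randn(4, 8, 16, 16)
ldj = torch.zeros(4)
loss = nll_loss(z, ldj)
print(loss.item())
```

Training minimizes this; at inference the same log-likelihood (exponentiated, as above) becomes the per-pixel normality score.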

Howeng98 commented 2 years ago

@gathierry In your implementation, does the AUROC refer to segmentation performance?

gathierry commented 2 years ago

@Howeng98 yes, the pixel level AUROC

briliantnugraha commented 2 years ago

> Does anyone have a clue how we could measure model performance without having labeled ground-truth test data?

Hi @maaft, IMO, if you are trying to measure model performance as an accuracy/percentage (%), that isn't possible without labels. However, if you are using distance measurements (kNN/aNN/Euclidean, etc.), then it should be possible, with the caveat that you will still need to verify the measurements in the end (manual labeling by human eyes) in order to map distance measurement -> accuracy. Hope this helps :).

Note: Very cool progress and results, guys, considering how little information the paper gives.

questionstorer commented 2 years ago

This is a great discussion. I also performed some experiments using wide_resnet_50_2.

  1. I wrote the various modules in FastFlow from scratch myself, such as actnorm, affine coupling, channel permute, and split/merge, and combined them in a module fastflow_head (following the framework here). I didn't use the FrEIA framework just because the framework I follow is more transparent to me.

  2. Model details. I largely follow the paper's setting, using a flow step (called fastflow_head in my code) consisting of actnorm -> random channel shuffling -> two affine couplings acting on different halves of the channels, with Conv-ReLU-Conv subnets. Some things are not clearly specified in the paper; let me try to make them explicit here.

    • I use 3 feature maps from 3 layers of wide_resnet_50_2. On MVTec, they have sizes (256, 64, 64), (512, 32, 32), and (1024, 16, 16) respectively. I followed the paper and built 3 FastFlow models using these 3 feature maps.
    • Hidden channels in the Conv-ReLU-Conv block: I use the same number as the number of input channels to the first Conv, that is, half the total channels of the original input from the feature extractor. For wide_resnet_50_2, the hidden channels are respectively 128, 256, and 512. I think there is some ambiguity in the original paper: they say the numbers of input and output channels of the subnet are the same, but that's not the case.
    • For actnorm, I initialize the log-scale and bias with all zeros. This differs from the initialization in the Glow paper, but I think there is no reason to initialize with the mean and std of the activations in our case. In experiments, I found the trivial initialization makes training easier.
    • For the optimizer, I follow the paper.
    • For the loss, I use the same loss as the CFlow paper.
    • For the number of flow steps, I use 4 flow steps with alternating 3x3 and 1x1 convs instead of the 8 flow steps specified in the paper. I found this makes a very big difference: I observed very unstable training with 8 flow steps, sometimes with NaN losses or gradients. 4 flow steps are easier to train, and this matches the parameter count specified in the paper.
    • For the number of epochs, I use 500. I also observed that the model stabilizes or reaches its best performance within a few steps. But in my own project (where there is no labelled ground truth for testing), I use some good objects as a validation dataset and observe that the validation loss reaches its minimum at around 250 epochs.
    • For the final anomaly map, I average the anomaly maps from the 3 FastFlow models. For each FastFlow model, I take the same approach as gathierry. This can be understood as taking the product of the probabilities along the channels.
    • Under the above specifications, the number of parameters is 41.3M. Using 8 flow steps doubles this to 82.6M; doubling the number of hidden channels also doubles it to 82.6M.
  3. I did not experiment on every category of the MVTec dataset. I only experimented on the bottle category and saw that its pixel-wise AUROC matches the paper's result in just a few steps. Maybe other categories require more training epochs? Of that I'm not sure.
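The flow-step recipe described in point 2 could be sketched from scratch roughly as follows. This is my reading, not the commenter's actual code; I simplify to a single coupling per step, and all class names are hypothetical.

```python
import torch
import torch.nn as nn

class ActNorm(nn.Module):
    """Per-channel affine with the trivial all-zero initialization described above."""
    def __init__(self, channels):
        super().__init__()
        self.log_scale = nn.Parameter(torch.zeros(1, channels, 1, 1))
        self.bias = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x):
        return (x + self.bias) * torch.exp(self.log_scale)

class AffineCoupling(nn.Module):
    """Coupling with a Conv-ReLU-Conv subnet; hidden channels = half the input channels."""
    def __init__(self, channels, kernel=3):
        super().__init__()
        half, hidden, pad = channels // 2, channels // 2, kernel // 2
        self.subnet = nn.Sequential(
            nn.Conv2d(half, hidden, kernel, padding=pad),
            nn.ReLU(),
            nn.Conv2d(hidden, channels, kernel, padding=pad),  # outputs scale and shift
        )

    def forward(self, x):
        xa, xb = x.chunk(2, dim=1)
        s, t = self.subnet(xa).chunk(2, dim=1)
        return torch.cat([xa, xb * torch.exp(s) + t], dim=1)

class FlowStep(nn.Module):
    """actnorm -> fixed random channel shuffle -> affine coupling."""
    def __init__(self, channels, kernel=3):
        super().__init__()
        self.actnorm = ActNorm(channels)
        self.register_buffer("perm", torch.randperm(channels))
        self.coupling = AffineCoupling(channels, kernel)

    def forward(self, x):
        return self.coupling(self.actnorm(x)[:, self.perm])

# One step on the layer1 feature map; a full model would stack 4 steps,
# alternating kernel=3 and kernel=1.
step = FlowStep(256, kernel=3)
print(step(torch.randn(1, 256, 64, 64)).shape)  # torch.Size([1, 256, 64, 64])
```

A complete implementation would also track the log-determinant of each sub-transform for the likelihood loss; this sketch only shows the forward shapes and the zero-init actnorm behaving as an identity at the start of training.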

mjack3 commented 2 years ago

Guys, what does this mean in Figure 3? What is y'? [image]

AlessioGalluccio commented 2 years ago

> Guys, what does this mean in Figure 3? What is y'? [image]

y' is the output of the coupling flow, while y is the input. Coupling flows work by splitting the input, keeping one part unchanged ("a" in this example) and modifying the other ("b" in this example). This is needed to have invertibility and an easy-to-compute Jacobian determinant, since the Jacobian matrix becomes triangular.
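A toy numeric check of this, with hypothetical functions s and t (any functions work; they need not be invertible themselves): the coupling keeps y_a = x_a and maps y_b = x_b * exp(s(x_a)) + t(x_a), so it can be inverted exactly without inverting s or t.

```python
import torch

def s(a): return torch.tanh(a)  # hypothetical scale subnet
def t(a): return a * 0.5        # hypothetical shift subnet

x_a, x_b = torch.randn(8), torch.randn(8)
y_a, y_b = x_a, x_b * torch.exp(s(x_a)) + t(x_a)   # forward pass
x_b_rec = (y_b - t(y_a)) * torch.exp(-s(y_a))      # exact inverse, using only y
print(torch.allclose(x_b_rec, x_b, atol=1e-6))  # True
```

Because y_b depends on x_a only through s and t, the Jacobian is block-triangular and its log-determinant is just the sum of s(x_a).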

HanMinung commented 5 months ago

I am very grateful for your GitHub source. However, I have a few questions regarding the use of the model. As the epochs progress, the loss value appears as follows, and I wonder if this is correct. Secondly, I placed the mvtec-ad data file in the path where the source files are gathered, but the following error occurs. Could you tell me what the problem is?

[image]

lil-wayne-0319 commented 3 months ago

> I am very grateful for your GitHub source. However, I have a few questions regarding the use of the model. As the epochs progress, the loss value appears as follows, and I wonder if this is correct. Secondly, I placed the mvtec-ad data file in the path where the source files are gathered, but the following error occurs. Could you tell me what the problem is?

[image]

Maybe you can check `dataset.py`, line 93:

```python
target = Image.open(
    image_file.replace("test", "ground_truth").replace(
        ".png", "_mask.png"
    )
)
```

Try replacing `"test"` / `"ground_truth"` with `"/test/"` / `"/ground_truth/"` so that only the directory name in the path is rewritten.
Rappy325 commented 1 month ago

Hello, thanks for the repo. I'm currently studying the FastFlow model and tried running the code, but it turns out there's an error during the eval step. I wonder why that happened. Could you help me? [image]