V-Sekai-fire / godot-visual-anomaly-detection

MIT License

Better to crop images or use as is #1

Open fire opened 3 weeks ago

fire commented 3 weeks ago

Is it better to crop the "normal" images or combine them in a large image for anomaly detection?

image

There are two sizes. Since both square and rectangular images count as "normal", what resolution should I keep them at?

The data includes square portraits and rectangular full-body images, along with palettes and sometimes clothing accessories.

Typical images would be head portraits, full body images and clothing accessories.

Abnormal images would be text and impossible body areas.

fire commented 3 weeks ago

My question was: should I crop the data to build the "normal" training set? The alternative is masking, where I manually remove the unwanted regions and use what remains as the typical image. I'm currently working at 1024x1024; my first thought was to use 512x512.
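To make the two options concrete, here is a rough sketch in plain Python. It assumes that "cropping" means cutting a 1024x1024 image into non-overlapping 512x512 tiles, and that "masking" means blanking out pixels I mark by hand. The names `tile_boxes` and `apply_mask` are made up for illustration and do not come from any library.

```python
def tile_boxes(width, height, tile=512):
    """Return (left, top, right, bottom) boxes for non-overlapping tiles.

    Partial tiles at the right and bottom edges are skipped, so this only
    covers the whole image when both dimensions are multiples of `tile`.
    """
    boxes = []
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            boxes.append((left, top, left + tile, top + tile))
    return boxes


def apply_mask(pixels, mask, fill=0):
    """Masking alternative: keep a pixel where the mask is truthy,
    otherwise replace it with `fill`. Both inputs are lists of rows."""
    return [
        [value if keep else fill for value, keep in zip(pixel_row, mask_row)]
        for pixel_row, mask_row in zip(pixels, mask)
    ]


# A 1024x1024 image yields four 512x512 crop boxes.
print(tile_boxes(1024, 1024))
```

Cropping changes the image size and multiplies the number of samples, while masking keeps the size and sample count but leaves filled regions that the model will also see as "normal".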

fire commented 3 weeks ago

@Ivorforce says a rule of thumb is that for every ten training samples, you can train one weight.

Ivorforce commented 3 weeks ago

> @Ivorforce says a rule of thumb is that for every ten training samples, you can train one weight.

That's called the 1 in 10 rule.

fire commented 3 weeks ago

So, what number of typical images do I need to get?

@Ivorforce doesn't know and suggests starting with 10_000 and seeing if that's enough.
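Taken literally, the rule of thumb is simple arithmetic. A small sketch, assuming exactly ten samples per weight (the helper names are hypothetical, and the rule is a heuristic, not a guarantee):

```python
def max_weights(num_samples, samples_per_weight=10):
    """Upper bound on trainable weights under the ten-samples-per-weight heuristic."""
    return num_samples // samples_per_weight


def samples_needed(num_weights, samples_per_weight=10):
    """Training samples the same heuristic asks for, given a weight count."""
    return num_weights * samples_per_weight


print(max_weights(10_000))        # 1000
print(samples_needed(1_000_000))  # 10000000
```

So 10_000 typical images would only "pay for" about a thousand weights under this rule, which is why it is a starting point to test against rather than a target.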

fire commented 3 weeks ago

@JosephCatrambone mentioned that it is generally better to crop to the expected domain, if possible, in machine learning. See 'ROI detection' for a description of the cropping problem.

We discussed using Segment Anything 2 to generate category-to-image rectangle mappings.
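As a sketch of what such a mapping could look like: assuming the segmentation step hands back one binary mask per detected region together with a category label (that labelling step is an assumption, and no Segment Anything 2 API is called here), each mask can be reduced to its bounding rectangle and grouped by category. `mask_to_box` and `build_category_boxes` are hypothetical helper names.

```python
def mask_to_box(mask):
    """Bounding box of the truthy cells in a mask given as a list of rows.

    Returns (left, top, right, bottom) with right and bottom exclusive,
    or None if the mask is empty.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for row in mask for x, value in enumerate(row) if value]
    return (min(cols), rows[0], max(cols) + 1, rows[-1] + 1)


def build_category_boxes(labelled_masks):
    """Group bounding boxes by category.

    `labelled_masks` is an iterable of (category, mask) pairs; masks with
    no truthy cells are skipped.
    """
    boxes = {}
    for category, mask in labelled_masks:
        box = mask_to_box(mask)
        if box is not None:
            boxes.setdefault(category, []).append(box)
    return boxes


head_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
print(build_category_boxes([("head", head_mask)]))
```

The resulting rectangles could then be used as crop regions, so each "normal" sample is cut to its expected domain instead of being cropped by hand.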