Difference between original model and conservative model?

smousavi05 / EQTransformer

EQTransformer, a python package for earthquake signal detection and phase picking using AI.

https://rebrand.ly/EQT-documentations

MIT License


Difference between original model and conservative model? #138

Closed: JoshRichW closed this issue 2 years ago

JoshRichW commented 2 years ago

Could you add some documentation explaining the difference between the original model and the conservative model in ModelsAndSampleData?

The difference in level of detection between the two is quite remarkable, and I'd be interested to know part of what causes that.

Also thanks for sharing the code and models for EQTransformer, it is much appreciated.

smousavi05 commented 2 years ago

@JoshRichW The two models differ very little in network architecture (only 1 or 2 layers). The main difference comes from the training procedure and the hyperparameters used for data augmentation. If you want to maximize the number of detections and are not concerned about the false positive rate (false positives can be removed in the association and location steps), use the original model with higher threshold values. If instead you care about detecting only true events, the conservative model (with much lower threshold levels) will suit you better.
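
For readers who want to see the trade-off described above in concrete terms, here is a minimal, self-contained sketch. It does not call EQTransformer; the names `PickerSettings`, `choose_settings`, and `keep_detections` are hypothetical, and the threshold numbers are placeholders chosen only to reflect the "original with higher thresholds, conservative with much lower thresholds" guidance, not values recommended by the author.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PickerSettings:
    """Hypothetical container for a model choice and its thresholds."""
    model: str
    detection_threshold: float
    p_threshold: float
    s_threshold: float


def choose_settings(goal: str) -> PickerSettings:
    """Map a detection goal to a model and illustrative thresholds.

    The numbers below are placeholders (assumptions), not tuned values.
    """
    if goal == "max_detections":
        # Original model: more detections, more false positives,
        # so pair it with higher thresholds.
        return PickerSettings("original", 0.3, 0.1, 0.1)
    if goal == "few_false_positives":
        # Conservative model: fewer false positives,
        # so pair it with much lower thresholds.
        return PickerSettings("conservative", 0.03, 0.01, 0.01)
    raise ValueError(f"unknown goal: {goal!r}")


def keep_detections(probabilities, threshold):
    """Keep only the probabilities at or above the threshold."""
    return [p for p in probabilities if p >= threshold]


if __name__ == "__main__":
    settings = choose_settings("few_false_positives")
    # Synthetic detection probabilities, for illustration only.
    synthetic = [0.02, 0.05, 0.5]
    print(settings.model, keep_detections(synthetic, settings.detection_threshold))
```

Whatever thresholds you settle on should come from checking the output against your own data; the sketch only shows how the model choice and the thresholds move together.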

JoshRichW commented 2 years ago

Thank you for the information. Are you able to comment on the differences in training/hyperparameters? I'm looking at training the model on a different dataset so any tips/tricks you can share would be great.

smousavi05 commented 2 years ago

@JoshRichW There is no general rule for hyperparameter tuning. It depends heavily on the characteristics of your data, how the model initially performed on it, and which issues you are trying to address or improve.

JoshRichW commented 2 years ago

OK, thank you for the information.