kadirnar / segment-anything-video

MetaSeg: Packaged version of the Segment Anything repository
Apache License 2.0
952 stars, 67 forks

onnx script fix and added as main package #73

Closed: onuralpszr closed this 1 year ago

onuralpszr commented 1 year ago

Expected output, to confirm that the script works. The import fixes were checked as well.

 python export_onnx_model.py -h
usage: export_onnx_model.py [-h] --checkpoint CHECKPOINT --output OUTPUT [--model-type MODEL_TYPE] [--return-single-mask] [--opset OPSET]
                            [--quantize-out QUANTIZE_OUT] [--gelu-approximate] [--use-stability-score] [--return-extra-metrics]

Export the SAM prompt encoder and mask decoder to an ONNX model.

options:
  -h, --help            show this help message and exit
  --checkpoint CHECKPOINT
                        The path to the SAM model checkpoint.
  --output OUTPUT       The filename to save the ONNX model to.
  --model-type MODEL_TYPE
                        In ['default', 'vit_b', 'vit_l']. Which type of SAM model to export.
  --return-single-mask  If true, the exported ONNX model will only return the best mask, instead of returning multiple masks. For high resolution
                        images this can improve runtime when upscaling masks is expensive.
  --opset OPSET         The ONNX opset version to use. Must be >=11
  --quantize-out QUANTIZE_OUT
                        If set, will quantize the model and save it with this name. Quantization is performed with quantize_dynamic from
                        onnxruntime.quantization.quantize.
  --gelu-approximate    Replace GELU operations with approximations using tanh. Useful for some runtimes that have slow or unimplemented erf ops,
                        used in GELU.
  --use-stability-score
                        Replaces the model's predicted mask quality score with the stability score calculated on the low resolution masks using
                        an offset of 1.0.
  --return-extra-metrics
                        The model will return five results: (masks, scores, stability_scores, areas, low_res_logits) instead of the usual three.
                        This can be significantly slower for high resolution outputs.
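
For context on the --gelu-approximate option above: the help text says GELU is replaced with a tanh-based approximation for runtimes where erf is slow or missing. The sketch below is not code from export_onnx_model.py. It only compares the erf-based GELU definition with the commonly used tanh approximation, on the assumption that this is the approximation the flag refers to. The function names are made up for illustration.

```python
import math


def gelu_exact(x: float) -> float:
    """GELU defined through the Gaussian error function (erf)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Commonly used tanh-based approximation of GELU (assumed to match the flag)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def max_abs_difference(lo: float = -5.0, hi: float = 5.0, steps: int = 1001) -> float:
    """Largest gap between the two variants on an evenly spaced grid."""
    step = (hi - lo) / (steps - 1)
    return max(
        abs(gelu_exact(lo + i * step) - gelu_tanh(lo + i * step))
        for i in range(steps)
    )


if __name__ == "__main__":
    # Print a few sample points side by side, then the worst-case gap on the grid.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={x:+.1f}  erf={gelu_exact(x):+.6f}  tanh={gelu_tanh(x):+.6f}")
    print(f"max |difference| on [-5, 5]: {max_abs_difference():.2e}")
```

The two variants track each other closely on this range, which is presumably why the swap is offered as an export-time option. Whether the difference matters for mask quality in a given runtime would still need to be checked on real images.
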

kadirnar commented 1 year ago

Thank you so much🎉