julien-c closed this 4 years ago
Most likely these are bugs in the converter, since the operators you mentioned (where, masked_fill, softmaxND, and gelu with 3 different modes) are all supported in the CoreML spec.
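For readers unfamiliar with the "3 different modes" of gelu, here is a minimal sketch of the three formulas usually meant by that: the exact erf form, the tanh approximation, and the sigmoid approximation. I am assuming these correspond to the three modes in the CoreML spec; the function names are hypothetical and the code is plain Python, not converter code.

```python
import math

def gelu_exact(x):
    # Exact GELU: x * Phi(x), with the normal CDF written via erf.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    # Tanh approximation of GELU.
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))

def gelu_sigmoid(x):
    # Sigmoid approximation of GELU: x * sigmoid(1.702 * x).
    return x / (1.0 + math.exp(-1.702 * x))

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(x, gelu_exact(x), gelu_tanh(x), gelu_sigmoid(x))
```

The exact form is the one that relies on erf, which is the operator the original report had trouble exporting.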
What difference does it make to the ONNX graph when nn.Softmax(dim=-1) is replaced by nn.functional.softmax?
@julien-c We have added support for the Where op in #487. Regarding the Softmax layer: we use the old softmax layer by default due to an overflow issue. The old softmax layer is rank dependent, so it is blocked on ONNX shape inference.
But if we use custom_conversion_functions to map Softmax to the new softmax layer, the model converts with good SNR and PSNR scores:
Start Scores: SNR 104.39693320792118, PSNR 86.34187544571695
End Scores: SNR 103.61101916379764, PSNR 86.12284546374713
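The thread does not say what the overflow issue is. One common source of overflow in softmax is exponentiating large logits directly, so the sketch below illustrates that failure mode and the usual max-subtraction fix. This is an assumption about the kind of problem involved, not a description of the CoreML layers; the function names are hypothetical.

```python
import math

def softmax_naive(xs):
    # Exponentiates raw logits; math.exp raises OverflowError for large inputs.
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_stable(xs):
    # Subtracting the maximum first keeps every exponent at or below zero.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

logits = [1000.0, 999.0, 998.0]
try:
    softmax_naive(logits)
    naive_ok = True
except OverflowError:
    naive_ok = False

probs = softmax_stable(logits)
print("naive succeeded:", naive_ok)
print("stable result:", probs)
```

In float32, which is what on-device inference typically uses, exp already overflows for inputs a little above 88, so the problem shows up at far smaller logits than in this double-precision example.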
Please use the following script to convert the model:
from pytorch_transformers.modeling_distilbert import DistilBertForQuestionAnswering
from onnx_coreml import convert
import torch
import numpy as np

model = DistilBertForQuestionAnswering.from_pretrained(
    "distilbert-base-uncased-distilled-squad", torchscript=True
)
# Uncomment to save the PyTorch model; the validation script loads distilbert.pt.
# torch.save(model, './distilbert.pt')
model.eval()

# Export to ONNX with a fixed input of shape (1, 128).
torch.onnx.export(
    model,
    torch.ones(1, 128, dtype=torch.long),
    "distilbert-squad-128.onnx",
    verbose=True,
    input_names=["input_ids"],
    output_names=["start_scores", "end_scores"],
)

def _convert_softmax(builder, node, graph, err):
    '''
    convert to CoreML SoftMax ND Layer:
    https://github.com/apple/coremltools/blob/655b3be5cc0d42c3c4fa49f0f0e4a93a26b3e492/mlmodel/format/NeuralNetwork.proto#3547
    '''
    axis = node.attrs.get('axis', 1)
    builder.add_softmax_nd(
        name=node.name,
        input_name=node.inputs[0],
        output_name=node.outputs[0] + ('_softmax' if node.op_type == 'LogSoftmax' else ''),
        axis=axis
    )
    if node.op_type == 'LogSoftmax':
        builder.add_unary(
            name=node.name+'_log',
            input_name=node.outputs[0]+'_softmax',
            output_name=node.outputs[0],
            mode='log'
        )

mlmodel = convert(
    model="./distilbert-squad-128.onnx",
    target_ios="13",
    custom_conversion_functions={'Softmax': _convert_softmax},
)
mlmodel.save('./converted.mlmodel')
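The LogSoftmax branch of _convert_softmax chains a softmax layer with an elementwise log. As a sanity check on that decomposition, here is a small plain-Python sketch comparing the two-step form against the direct log-sum-exp form. The helper names are hypothetical and nothing here calls the CoreML builder.

```python
import math

def softmax(xs):
    # Max-subtracted softmax over a flat list.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def log_softmax_two_step(xs):
    # Softmax followed by elementwise log, mirroring the two chained layers.
    return [math.log(p) for p in softmax(xs)]

def log_softmax_direct(xs):
    # Direct form: x - logsumexp(x).
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]

xs = [2.0, 1.0, -1.0]
two_step = log_softmax_two_step(xs)
direct = log_softmax_direct(xs)
print(two_step)
print(direct)
```

Note that the convert call in the script registers the function only under the 'Softmax' key, so the LogSoftmax branch would run only if the same function were also registered for 'LogSoftmax'.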
I validated the model accuracy as follows:
import onnx
import onnxruntime as rt
import coremltools
import torch
import numpy as np

def _compute_SNR(x,y):
    noise = x - y
    noise_var = np.sum(noise ** 2)/len(noise) + 1e-7
    signal_energy = np.sum(y ** 2)/len(y)
    max_signal_energy = np.amax(y ** 2)
    SNR = 10 * np.log10(signal_energy/noise_var)
    PSNR = 10 * np.log10(max_signal_energy/noise_var)
    return SNR, PSNR

spec = coremltools.utils.load_spec('./converted.mlmodel')
mlmodel = coremltools.models.MLModel(spec, useCPUOnly=True)
input = np.random.randint(0, high=1000, size=(1, 128))
input_dict = {'input_ids': input.astype(np.float32)}
pred_coreml = mlmodel.predict(input_dict, useCPUOnly=True)
model = torch.load('distilbert.pt')
pred_pt = model(torch.from_numpy(input).type(torch.LongTensor))
pt_out = {}
pt_out['start_scores'] = pred_pt[0].detach().numpy()
pt_out['end_scores'] = pred_pt[1].detach().numpy()
snr, psnr = _compute_SNR(pred_coreml['start_scores'], pt_out['start_scores'])
print('Start Scores: SNR {}, PSNR {}'.format(snr, psnr))
snr, psnr = _compute_SNR(pred_coreml['end_scores'], pt_out['end_scores'])
print('End Scores: SNR {}, PSNR {}'.format(snr, psnr))
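To make the metric concrete, here is a minimal plain-Python sketch of the same SNR and PSNR definitions on flat lists, averaging over every element. The function name and sample values are hypothetical. One caveat about the script above: len() on a NumPy array returns the size of the first dimension, so if the outputs have shape (1, 128) the sums there are divided by 1 rather than 128, and the numbers from this sketch would not be directly comparable.

```python
import math

def compute_snr(x, y):
    # x: predicted values, y: reference values (flat lists of equal length).
    n = len(y)
    noise_var = sum((a - b) ** 2 for a, b in zip(x, y)) / n + 1e-7
    signal_energy = sum(b ** 2 for b in y) / n
    max_signal_energy = max(b ** 2 for b in y)
    snr = 10 * math.log10(signal_energy / noise_var)
    psnr = 10 * math.log10(max_signal_energy / noise_var)
    return snr, psnr

reference = [1.0, -2.0, 3.0, -4.0]
predicted = [v + 0.001 for v in reference]  # small constant error
snr, psnr = compute_snr(predicted, reference)
print('SNR {}, PSNR {}'.format(snr, psnr))
```

Both values are in decibels, and PSNR is always at least as large as SNR because the peak squared value is at least the mean squared value.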
@julien-c could you please try the above script with ToT (tip of tree) onnx-coreml, installed from source?
Thank you @bhushan23, it works! I've pushed the models to our repo (with credits) and tweeted a link to the release: https://twitter.com/julien_c/status/1181615276439330816
We'll integrate the model inside our demo Squad app later today.
Just updated our demo app to use the onnx-coreml converted DistilBERT 🎉
Inference on device is ~35% faster ⚡️
Merged PR is: https://github.com/huggingface/swift-coreml-transformers/pull/13
Awesome!! @julien-c could you please give more details? Faster than tf-lite?
@bhushan23 DistilBERT is ~35% faster than (full) BERT on device (while keeping 97% of the accuracy on Squad)
I'm trying to convert an ONNX export of DistilBERT to CoreML, using the following code:
I've hand-converted this model to CoreML before (as well as GPT-2, in this repo), but for ease of use and scalability to future models I would like to use onnx-coreml in a more seamless way.

I'm encountering a few different roadblocks:
- NotImplementedError: Unsupported ONNX ops of type: Where on torch.masked_fill_. Would this operation be supported at some point? In the meantime, I can probably work around this by changing my PyTorch code to another equivalent construct.
- nn.Softmax(dim=-1) layer. Replacing it with nn.functional.softmax seems to work.
- torch.erf in gelu fails. TODO: Check why and see why torch.nn.functional.gelu is not converted to ONNX.

Here's the ONNX file: https://s3.amazonaws.com/models.huggingface.co/bert/distilbert-squad-128.onnx
Help would be super appreciated!
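As an illustration of the "equivalent construct" mentioned for masked_fill_, one possible workaround is to replace the select-style fill with plain multiply and add, which avoids a Where node. The sketch below shows the idea on flat Python lists. The function names are hypothetical, and it assumes the mask holds 0.0 or 1.0 and the fill value is finite.

```python
import math

def masked_fill_select(values, mask, fill):
    # Select-style form: take fill where the mask is set, else the value.
    return [fill if m else v for v, m in zip(values, mask)]

def masked_fill_arithmetic(values, mask, fill):
    # Arithmetic form using only multiply and add.
    # The fill must be finite: 0 * inf is nan and would corrupt unmasked entries.
    return [v * (1.0 - m) + fill * m for v, m in zip(values, mask)]

scores = [0.5, 1.5, -0.25, 2.0]
mask = [0.0, 1.0, 0.0, 1.0]
fill = -1e9  # large negative finite value standing in for -inf

print(masked_fill_select(scores, mask, fill))
print(masked_fill_arithmetic(scores, mask, fill))
```

Using a large finite negative value instead of -inf is the design choice that makes the arithmetic form safe; after a softmax the masked positions still end up with a probability that is effectively zero.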