Open TylerLeonhardt opened 3 years ago
Just before the merge, I updated the model that was in the PR with a better trained one https://github.com/yoeo/guesslang/pull/33/commits/198352a0027199f29c995afe9db5a66dd9403e99 . Maybe you're using the model that was pushed just before this change.
If that's the case, you can now find the latest model in the main branch https://github.com/yoeo/guesslang/tree/master/guesslang/data/model . It is still not as precise as the 30-languages model, but it produces better results with your TypeScript example:
✓ echo $'function makeThing(): Thing {
let size = 0;
return {
get size(): number {
return size;
},
set size(value: string | number | boolean) {
let num = Number(value);
// Don\'t allow NaN and stuff.
if (!Number.isFinite(num)) {
size = 0;
return;
}
size = num;
},
};
}' | guesslang -p
Language name Probability
TypeScript 32.21%
JavaScript 9.24%
Rust 7.14%
C# 5.80%
C 4.71%
Lua 4.59%
@TylerLeonhardt, just out of curiosity, how do you convert the model to TensorflowJS?
I tried with the current stable version of tensorflowjs, with no special options, and got an error:
tensorflowjs_converter --input_format=tf_saved_model ./guesslang/data/model /tmp/web_model
...
E tensorflow/core/grappler/grappler_item_builder.cc:669] Init node head/predictions/class_string_lookup/table_init/LookupTableImportV2 doesn't exist in graph
...
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
Traceback (most recent call last):
File ".../bin/tensorflowjs_converter", line 8, in <module>
sys.exit(pip_main())
File ".../lib64/python3.9/site-packages/tensorflowjs/converters/converter.py", line 813, in pip_main
main([' '.join(sys.argv[1:])])
File ".../lib64/python3.9/site-packages/tensorflowjs/converters/converter.py", line 817, in main
convert(argv[0].split(' '))
File ".../lib64/python3.9/site-packages/tensorflowjs/converters/converter.py", line 803, in convert
_dispatch_converter(input_format, output_format, args, quantization_dtype_map,
File ".../lib64/python3.9/site-packages/tensorflowjs/converters/converter.py", line 523, in _dispatch_converter
tf_saved_model_conversion_v2.convert_tf_saved_model(
File ".../lib64/python3.9/site-packages/tensorflowjs/converters/tf_saved_model_conversion_v2.py", line 683, in convert_tf_saved_model
optimize_graph(frozen_graph, signature,
File ".../lib64/python3.9/site-packages/tensorflowjs/converters/tf_saved_model_conversion_v2.py", line 153, in optimize_graph
raise ValueError('Unsupported Ops in the model before optimization\n' +
ValueError: Unsupported Ops in the model before optimization
OptionalNone, ReadVariableOp, OptionalFromValue
@pyu10055 gave me this pointer in https://github.com/tensorflow/tfjs/issues/4838#issuecomment-866416464
tensorflowjs_converter --input_format=tf_saved_model --skip_op_check model web_model
> Maybe you're using the model that was pushed just before this change.
hmm I grabbed https://github.com/yoeo/guesslang/tree/master/guesslang/data/model this morning actually so I'm fairly certain I have the correct one... I wonder if there's a loss in confidence during the conversion to the tfjs model 🤔
Cool, the conversion now works.
I can see that the converter prints messages about various optimisations.
I especially suspect that the int64 to int32 conversion has an impact on the model accuracy.
Maybe @pyu10055 has guidance here? Or perhaps @dynamicwebpaige?
@TylerLeonhardt @yoeo We do convert int64 to int32, but if I understand correctly those values are not weight-related; most of them are IDs. The missing ops errors can be ignored, since they may come from training functions that are not used in the inference graph.
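To illustrate why narrowing IDs from int64 to int32 would not be expected to change predictions, here is a minimal sketch using plain JavaScript `| 0` narrowing. It is an assumption-laden illustration, not the code path `tensorflowjs_converter` uses: `toInt32` and the sample `classIds` are hypothetical names and values introduced only for this example.

```typescript
// Illustration only: JavaScript's built-in ToInt32 narrowing via `| 0`.
// This is NOT the converter's implementation; it only shows that integers
// already inside the int32 range come out unchanged.
function toInt32(value: number): number {
  return value | 0;
}

// Hypothetical class IDs (small, non-negative indices).
const classIds: number[] = [0, 1, 53];
const narrowed: number[] = classIds.map(toInt32);
console.log(narrowed); // [ 0, 1, 53 ]

// Only values outside the int32 range are altered by this kind of narrowing.
console.log(toInt32(2147483647)); // 2147483647 (largest int32, unchanged)
console.log(toInt32(2147483648)); // -2147483648 (wraps around)
```

If the int64 tensors really hold only small IDs, as suggested above, this kind of narrowing leaves them intact, which would point away from the int64 to int32 step as the cause of the lower confidence.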
@pyu10055 OK.
During the training phase I do use I/O functions to read & process the examples, and as you spotted, these functions are not used for inference.
I'm not sure how much can be done here but I thought I'd start a discussion.
Here's the following TypeScript snippet:
Which yields the following confidence:
Before #33, the confidence for TS was way over 40%. I'm currently saying "this file is a TS file if the model is at least 20% more confident than the next language" but unfortunately, this fails.
I'm a bit nervous to drop that 20% down any further...
Also interesting that Rust beat out JS...
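For reference, the "at least 20% more confident than the next language" rule could be sketched roughly as below. This is not the actual detection code: the `Guess` shape and the `isConfidentMatch` name are hypothetical, and it assumes the 20% means a gap of 20 percentage points over the best other candidate. The sample numbers are the ones from the `guesslang -p` output quoted earlier in this thread, not from the converted tfjs model.

```typescript
// Hypothetical shape for one row of guesslang-style output.
interface Guess {
  language: string;
  probability: number; // in percent, e.g. 32.21
}

// Assumption: "20% more confident" is read as a gap of at least
// `minMargin` percentage points over the best other language.
function isConfidentMatch(
  guesses: Guess[],
  target: string,
  minMargin: number = 20
): boolean {
  const hit = guesses.find((g) => g.language === target);
  if (hit === undefined) {
    return false;
  }
  const others = guesses
    .filter((g) => g.language !== target)
    .map((g) => g.probability);
  const bestOther = others.length > 0 ? Math.max(...others) : 0;
  return hit.probability - bestOther >= minMargin;
}

// Figures copied from the guesslang -p output shown in this thread.
const guesses: Guess[] = [
  { language: "TypeScript", probability: 32.21 },
  { language: "JavaScript", probability: 9.24 },
  { language: "Rust", probability: 7.14 },
  { language: "C#", probability: 5.8 },
  { language: "C", probability: 4.71 },
  { language: "Lua", probability: 4.59 },
];

console.log(isConfidentMatch(guesses, "TypeScript")); // true (gap is about 22.97 points)
console.log(isConfidentMatch(guesses, "TypeScript", 25)); // false
```

With these numbers the gap is only about 23 points, so even a modest drop in the TypeScript score or a rise in the runner-up is enough to make a fixed 20-point margin fail.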