Open supersonicclay opened 2 years ago
Hello @supersonicclay, this code snippet is very short, and the model is not well suited to small snippets. More about that here: https://guesslang.readthedocs.io/en/latest/contents.html#limitations
I can also repro with a very large JSON file.
Could you provide an example file? That may help improve the model.
In my project that uses guesslang (it works great!), I wrap it with a very simple fallback guesser based on just the first few characters:
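from typing import Optional

def guess_ext(self, code: str, probability_min: float = 0.12) -> Optional[str]:
    # Take guesslang's top-ranked (language, probability) pair.
    syntax, probability = self.guesser.probabilities(code)[0]
    ext = self.guesslang_syntaxes.get(syntax)
    if probability >= probability_min:
        return ext
    # Low confidence: fall back to matching the first few characters.
    # Longer prefixes are listed before the shorter ones they begin with.
    for start, ext in {
        '{': 'json',
        '---\n': 'yaml',
        '[[': 'toml', '[': 'ini',
        '<?php': 'php', '<': 'xml',
        '-- ': 'lua'
    }.items():
        if code.startswith(start):
            return ext
    return None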
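For anyone who wants to try the prefix fallback on its own, here is a minimal standalone sketch that does not depend on guesslang. The names `PREFIX_TO_EXT` and `fallback_ext` are made up for illustration, and stripping leading whitespace is an extra assumption that the snippet above does not make.

```python
from typing import Optional

# Hypothetical prefix table. Order matters: a longer prefix must come
# before any shorter prefix it starts with ('[[' before '[', '<?php' before '<').
PREFIX_TO_EXT = (
    ('<?php', 'php'),
    ('---\n', 'yaml'),
    ('[[', 'toml'),
    ('-- ', 'lua'),
    ('{', 'json'),
    ('[', 'ini'),
    ('<', 'xml'),
)


def fallback_ext(code: str) -> Optional[str]:
    """Guess a file extension from the first few characters of `code`."""
    # Assumption: leading whitespace is not significant for the guess.
    stripped = code.lstrip()
    for prefix, ext in PREFIX_TO_EXT:
        if stripped.startswith(prefix):
            return ext
    # No known prefix matched.
    return None
```

This is only a heuristic: a JSON document whose top-level value is an array starts with `[` and would be reported as `ini` here, so it is best kept as a last resort behind the model.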
I can also repro with a very large JSON file.
Could you provide an example file? That may help improve the model.
I actually just tried it with a larger JSON snippet in a single line and it recognized as JSON. It just took about a second, and I think I was going too fast before and thought it wasn't recognizing it.
@AndydeCleyre
I wrap it with a very simple fallback guesser based on just the first few characters
That's a very good fallback idea.
Maybe I can try to increase the accuracy of Guesslang's machine learning model by making it pay more attention to patterns like the ones you defined :-)
@supersonicclay
it recognized as JSON
Nice.
It just took about a second
Yes, the prediction can take some time especially when you're using the command line tool. However, you can use the Python API to make faster predictions:
# Setup everything, it can take seconds depending on your hardware configuration,
# but you only have to do it once.
from guesslang import Guess
guess = Guess()
# Then, run your predictions,
# the predictions will be computed really fast.
for code_snippet in my_code_snippets_list:
    result = guess.language_name(code_snippet)
    ...
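In a longer-lived program (an editor plugin, say), the same idea can be sketched as a lazily created, cached `Guess` instance, so the setup cost is paid only on the first prediction. The helper names `get_guess` and `language_names` are hypothetical; only `Guess()` and `Guess.language_name()` come from guesslang.

```python
from functools import lru_cache
from typing import Iterable, List, Optional


@lru_cache(maxsize=1)
def get_guess():
    """Create the guesslang model once and reuse it on later calls."""
    # Imported lazily so the slow setup only happens when first needed.
    from guesslang import Guess
    return Guess()


def language_names(snippets: Iterable[str], guess=None) -> List[Optional[str]]:
    """Return the guessed language name for each snippet.

    `guess` can be any object with a `language_name(str)` method;
    when omitted, the cached guesslang instance is used.
    """
    if guess is None:
        guess = get_guess()
    return [guess.language_name(snippet) for snippet in snippets]
```

Passing the guesser in as a parameter is a design choice that makes the helper easy to exercise with a stub, without loading the real model.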
Repro with latest VS Code.
Example:
Result: Plain text
Expected: JSON
I can also repro with a very large JSON file.