greyblake / whatlang-rs

Natural language detection library for Rust. Try demo online: https://whatlang.org/
MIT License
966 stars 108 forks

Generate language list without Tera #34

Closed elegaanz closed 5 years ago

elegaanz commented 5 years ago

Tera pulls in quite a lot of dependencies, and is only used to generate one file during the build. I removed it and replaced it with a pure Rust implementation. When running cargo test, I went from 132 dependencies to 96.

I can understand if you refuse this change, as it may be less readable than the current version, but it makes build times shorter (I'm using whatlang in a project with 400 dependencies that takes 20 minutes to build, so it would be great if I could avoid Tera and its dependencies).

Another solution would be to generate this file once and for all, as it is probably not updated very often.

greyblake commented 5 years ago

Hey, thanks for your effort. Your changes are essentially a revert of this PR: https://github.com/greyblake/whatlang-rs/pull/23

So Tera was added there on purpose. It makes it easier to maintain that automatically generated code.

I am not sure I'd like to take these changes (I hope you can understand my reasoning). The ideal solution would be to keep Tera, but generate the lang.rs file only once with some kind of make task or something.

If you have any ideas that could solve the problem in this way, I would really appreciate it!

elegaanz commented 5 years ago

Oh, I didn't see that you were already doing that before. And I totally understand that you would prefer to keep Tera.

Are the data files pulled from somewhere else, or are they just a nicer way to store this information? If the file format doesn't matter, maybe a macro could generate the repetitive Rust code while keeping things readable and removing the need for a templating engine. Tell me if this solution sounds good to you or not.
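To make the macro idea concrete, here is a minimal sketch, assuming the language list is written directly in Rust instead of in separate data files. The macro name `define_langs!`, the method names, and the three sample entries are hypothetical and only for illustration; they are not taken from whatlang's actual generated code.

```rust
// Hypothetical sketch: the macro name, methods, and entries below are
// illustrative assumptions, not whatlang's real definitions.
macro_rules! define_langs {
    ($( $variant:ident => ($code:literal, $name:literal) ),+ $(,)?) => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        pub enum Lang {
            $( $variant ),+
        }

        impl Lang {
            /// Returns the three-letter code of this language.
            pub fn code(self) -> &'static str {
                match self {
                    $( Lang::$variant => $code ),+
                }
            }

            /// Returns the English name of this language.
            pub fn eng_name(self) -> &'static str {
                match self {
                    $( Lang::$variant => $name ),+
                }
            }

            /// Looks up a language by its three-letter code.
            pub fn from_code(code: &str) -> Option<Lang> {
                match code {
                    $( $code => Some(Lang::$variant), )+
                    _ => None,
                }
            }
        }
    };
}

// Each language is declared once; the enum and all lookups are derived from it.
define_langs! {
    Eng => ("eng", "English"),
    Fra => ("fra", "French"),
    Deu => ("deu", "German"),
}
```

With something like this, adding a language would be a one-line change, and no templating engine or build script would be needed. The trade-off is that the list has to live in Rust syntax rather than in the current data files.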

greyblake commented 5 years ago

> Are the data files pulled from somewhere else, or are they just a nicer way to store this information?

The data files are taken from a similar library for JS, Franc: https://github.com/wooorm/franc/blob/master/packages/franc-all/data.json The data are obtained from text corpora.

> If the file format doesn't matter, maybe a macro could generate the repetitive Rust code while keeping things readable and removing the need for a templating engine.

Initially this was implemented as Ruby scripts. Then there was a PR that moved this into build.rs. I liked the idea, but did not like how it looked. Now I think I may go back to a Ruby script :D

greyblake commented 5 years ago

Probably a compromise would be to use a small supporting Rust program instead of the current build.rs.
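A rough sketch of what the core of such a helper program could look like, assuming it uses only the standard library and is run by hand rather than on every build. The function name `render_lang_rs` and the tuple layout (variant, code, English name) are assumptions for illustration; a real tool would parse the entries from the data files.

```rust
use std::fmt::Write;

/// Renders the source text of a hypothetical `lang.rs`.
/// Each entry is (enum variant, three-letter code, English name);
/// this layout is an assumption made for the sketch.
pub fn render_lang_rs(entries: &[(&str, &str, &str)]) -> String {
    let mut out = String::new();
    out.push_str("// This file is generated. Do not edit by hand.\n");
    out.push_str("#[derive(Debug, Clone, Copy, PartialEq, Eq)]\n");
    out.push_str("pub enum Lang {\n");
    for (variant, _, name) in entries {
        // Writing to a String cannot fail, so unwrap is safe here.
        writeln!(out, "    /// {}", name).unwrap();
        writeln!(out, "    {},", variant).unwrap();
    }
    out.push_str("}\n\n");
    out.push_str("impl Lang {\n");
    out.push_str("    pub fn code(self) -> &'static str {\n");
    out.push_str("        match self {\n");
    for (variant, code, _) in entries {
        // {:?} prints the code as a quoted Rust string literal.
        writeln!(out, "            Lang::{} => {:?},", variant, code).unwrap();
    }
    out.push_str("        }\n    }\n}\n");
    out
}
```

A small binary could pass the result to `std::fs::write` and the generated file could then be committed, so that users of the crate never compile the generator or its dependencies.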

elegaanz commented 5 years ago

Would it be a good idea to add it to misc/update_support_languages.rb? It would centralize all the parsing and code generation in one place, and Ruby supports templating with ERB out of the box.

greyblake commented 5 years ago

Closed in favor of https://github.com/greyblake/whatlang-rs/pull/37