Closed isagalaev closed 5 years ago
I'm for the second variant. But it looks like we need to bump the major version to change this set?
Please introduce a new major version and create standalone definitions for almost all languages. :package: :tada:
Changing versions is cheap, it's a technicality :-)
@derhuerst we already have all the languages available separately on CDNs at languages/<name>.min.js
@derhuerst we already have all the languages available separately on CDNs at languages/<name>.min.js
Yes, but to have them as standalone CommonJS-compatible NPM modules. (;
@derhuerst the NPM build already includes all the languages, there's no point in packaging them individually.
@derhuerst the NPM build already includes all the languages, there's no point in packaging them individually.
There is, for example in reducing the bundle size when using browserify, which seems to be the standard frontend workflow nowadays. It would also help keep major version bumps in the core to a minimum.
Moving the core detection into a separate module would also be nice, as many non-browser tools could directly use it (see #1086).
Let's keep the discussion in this issue related to the topic. We're not discussing packaging in general, we're discussing the :common set for browsers.
As for packaging, we're currently holding to a policy where we don't cater to any currently trending way of bundling but rather provide a build tool which can be used to package highlight.js in any way imaginable. We may reconsider this in the future, though.
@isagalaev I think your proposal is good: it's not based on personal preferences, instead it takes into account what users look for — therefore "common" refers to what is commonly expected.
But I would like to expand on the issue of keyword filtering:
If I add 2 or more keywords (e.g. :common :config), do these keywords filter each other out (i.e. reduce the selection to only those languages appearing in both) or do they broaden the selection (i.e. select languages belonging to either)?
I'd say the keyword filtering should allow specifications like + and -, so users could, for example, filter :common :config- :markup+ to have all common langs, minus config ones, plus all markup ones. How exactly to implement it is something to be thought about.
If I add 2 or more keywords (e.g. :common :config), do these keywords filter each other out (i.e. reduce the selection to only those languages appearing in both) or do they broaden the selection (i.e. select languages belonging to either)?
The latter.
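To make the two behaviours concrete, here is a minimal sketch of "broadening" (union) selection, with the proposed + and - suffixes layered on top. Everything in it is hypothetical: the category table, the `select_languages` name and the rule that exclusions are applied after all inclusions are illustration only, not how the build tool works.

```python
# Hypothetical category metadata, for illustration only.
LANGUAGE_CATEGORIES = {
    "javascript": {"common", "scripting"},
    "ini": {"common", "config"},
    "nginx": {"common", "config"},
    "xml": {"common", "markup"},
    "asciidoc": {"markup"},
    "haskell": {"functional"},
}


def select_languages(keywords, categories=LANGUAGE_CATEGORIES):
    """Return the sorted languages matched by keywords like ':common' or ':config-'.

    A plain keyword (or one with a trailing '+') adds every language in that
    category, so several keywords form a union. A trailing '-' removes every
    language in that category; removals are applied after all additions.
    """
    included, excluded = set(), set()
    for keyword in keywords:
        name = keyword.lstrip(":")
        target = excluded if name.endswith("-") else included
        name = name.rstrip("+-")
        target.update(lang for lang, cats in categories.items() if name in cats)
    return sorted(included - excluded)


# Union of two categories.
print(select_languages([":common", ":markup"]))
# All common languages, minus config ones, plus all markup ones.
print(select_languages([":common", ":config-", ":markup+"]))
```

With this sample table the second call drops ini and nginx and keeps asciidoc, javascript and xml.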
I'd say the keyword filtering should allow specifications like + and -
I'm not aware of anyone actually needing it. There are essentially just three ways the build tool is ever used:
:common set to host on CDNs
@isagalaev Could we get some fresh download stats to reconsider this issue in 2019? If you have time.
I vote both. I don't see the harm in having "small" and "medium" builds (so 40kb and 200kb, let the user decide). I just have no idea where in the codebase I could go right now to change this, or if this is hidden away on some build server somewhere.
Perhaps @marcoscaceres knows?
I think we both decided only @isagalaev knows.
@yyyc514 @egor-rogov I get the stats by parsing logs on highlightjs.org, it's not anywhere in the code base. Current stats:
$ ./language_top.py
544 xml
521 javascript
509 css
502 json
459 sql
456 http
454 bash
446 python
444 java
443 markdown
441 php
427 ruby
420 shell
419 cpp
415 cs
406 diff
405 makefile
404 nginx
400 ini
395 apache
393 perl
391 objectivec
388 coffeescript
385 properties
371 yaml
63 go
61 scss
46 kotlin
44 r
43 dockerfile
42 powershell
39 typescript
39 plaintext
39 lua
39 less
38 rust
38 groovy
37 gradle
36 swift
35 scala
34 vim
34 awk
31 erlang
29 dart
28 arduino
27 pgsql
27 django
27 basic
26 erlang-repl
26 cmake
25 matlab
25 fsharp
25 asciidoc
25 applescript
24 haskell
23 vbscript
23 vbnet
23 elm
22 haml
21 excel
20 mathematica
20 lisp
20 ebnf
20 dos
20 armasm
20 abnf
19 protobuf
19 gherkin
19 elixir
19 bnf
18 tex
18 julia
18 dns
18 clojure
18 capnproto
18 aspectj
17 vbscript-html
17 golo
17 avrasm
17 actionscript
17 accesslog
16 thrift
16 fortran
15 x86asm
15 ruleslanguage
15 htmlbars
15 1c
14 profile
14 delphi
14 autohotkey
13 xquery
13 twig
13 scheme
13 glsl
13 dts
13 d
13 ada
12 jboss-cli
12 handlebars
12 erb
12 brainfuck
12 autoit
11 vala
11 smalltalk
11 smali
11 purebasic
11 ocaml
11 mipsasm
11 livecodeserver
11 dust
11 dsconfig
11 csp
11 cal
11 angelscript
10 zephir
10 xl
10 vhdl
10 verilog
10 stylus
10 sml
10 sas
10 puppet
10 prolog
10 pf
10 oxygene
10 openscad
10 nix
10 llvm
10 leaf
10 isbl
10 hsp
10 haxe
10 gml
10 crystal
10 crmsh
9 stan
9 sqf
9 roboconf
9 reasonml
9 processing
9 pony
9 monkey
9 ldif
9 julia-repl
9 fix
9 cos
9 coq
9 clojure-repl
9 clean
9 ceylon
9 axapta
9 arcade
8 tp
8 tcl
8 tap
8 subunit
8 step21
8 stata
8 scilab
8 rsl
8 routeros
8 rib
8 qml
8 q
8 parser3
8 nsis
8 nimrod
8 n1ql
8 moonscript
8 mojolicious
8 mizar
8 mercury
8 mel
8 maxima
8 lsl
8 livescript
8 lasso
8 irpf90
8 inform7
8 hy
8 gcode
8 gauss
8 gams
8 flix
6 taggerscript
This is, however, not a very good source of truth anyway, for two reasons: it's biased (heavily) towards currently pre-selected :common, and more importantly, it doesn't include usage statistics from linking to CDNs. The only thing I can personally recommend looking at are the languages right below the heavy top: Go, SCSS, Kotlin, R… Those are the ones people apparently bother to manually select in not insignificant numbers.
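As a rough illustration of "the languages right below the heavy top", the sketch below parses an excerpt of the listing above and splits it at the largest drop between consecutive counts. The function names and the largest-drop rule are assumptions made for this example; this is not what language_top.py does.

```python
# Excerpt of the stats listing above (not the full output).
STATS_EXCERPT = """\
544 xml
521 javascript
395 apache
371 yaml
63 go
61 scss
46 kotlin
44 r
43 dockerfile
42 powershell
39 typescript
"""


def parse_stats(text):
    """Turn 'count name' lines into (count, name) tuples, keeping their order."""
    rows = []
    for line in text.splitlines():
        count, name = line.split()
        rows.append((int(count), name))
    return rows


def split_at_largest_drop(rows):
    """Split rows (sorted by descending count) where consecutive counts differ most."""
    drops = [rows[i][0] - rows[i + 1][0] for i in range(len(rows) - 1)]
    cut = drops.index(max(drops)) + 1
    return rows[:cut], rows[cut:]


top, rest = split_at_largest_drop(parse_stats(STATS_EXCERPT))
print("heavy top ends at:", top[-1])
print("first candidates:", [name for _, name in rest[:4]])
```

On this excerpt the split falls between yaml (371) and go (63), so the first candidates are go, scss, kotlin and r.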
However, this entire idea may not be worth tackling by itself in light of #1759.
P.S. I've been actually very much removed from highlight.js for quite some time now. I only noticed this discussion by pure accident among another 100+ new emails I suddenly got in my inbox :-)
currently pre-selected :common
@isagalaev The question is where is this :common list and how do we update it? IE, the list you've always used to build this "canonical 40kb file"... We're not going to come up with a new build system overnight but it would be nice to update :common when we issue new releases until then.
Or perhaps even to split common into light, medium, heavy, like coffee roasts, etc... who knows... I was hoping for such things to be in the codebase, but they appear to be hidden on your build server perhaps?
The question is where is this :common list and how do we update it?
Ah… This comes from metadata in language files, specifically the Category: key, like here: https://github.com/highlightjs/highlight.js/blob/master/src/languages/javascript.js#L4. These categories are used in the menu on the demo page, but the one named "common" has this special meaning of being pre-selected on the download page and also being included in the CDN build. This can indeed be updated entirely through source changes.
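A small sketch of reading that metadata, assuming the key sits on its own line in a leading comment as `Category: a, b`. Only the existence of the Category: key is stated above; the sample header, the comma-separated layout and the helper names are assumptions for illustration.

```python
import re

# Assumed header layout; only the "Category:" key itself is confirmed above.
SAMPLE_SOURCE = """\
/*
Language: JavaScript
Category: common, scripting
*/
function javascript(hljs) { return {}; }
"""

CATEGORY_LINE = re.compile(r"^Category:\s*(.+)$", re.MULTILINE)


def categories_of(source):
    """Return the categories listed on the first 'Category:' line, if any."""
    match = CATEGORY_LINE.search(source)
    if match is None:
        return []
    return [item.strip() for item in match.group(1).split(",") if item.strip()]


def is_common(source):
    """True when the file's metadata places it in the 'common' category."""
    return "common" in categories_of(source)


print(categories_of(SAMPLE_SOURCE))
print(is_common(SAMPLE_SOURCE))
```

Under these assumptions, adding a language to the common set amounts to adding "common" to that one line in its source file.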
Making more special categories would indeed require changes to the server. But I always felt it wasn't really a solution anyway.
Well, that makes it harder for us to have TWO different builds I suppose but it's very helpful to know we can change it. :-) Thanks!
@egor-rogov After reviewing this my votes for adding to common:
-rw-r--r-- 1 jgoebel staff 732 Oct 14 18:02 src/languages/dockerfile.js
-rw-r--r-- 1 jgoebel staff 1692 Oct 14 18:00 src/languages/go.js
-rw-r--r-- 1 jgoebel staff 6414 Oct 14 18:02 src/languages/kotlin.js
-rw-r--r-- 1 jgoebel staff 5022 Oct 14 18:01 src/languages/less.js
-rw-r--r-- 1 jgoebel staff 2965 Oct 14 18:01 src/languages/lua.js
-rw-r--r-- 1 jgoebel staff 209 Oct 14 17:56 src/languages/plaintext.js
-rw-r--r-- 1 jgoebel staff 1873 Oct 14 18:02 src/languages/r.js
-rw-r--r-- 1 jgoebel staff 3523 Oct 14 18:03 src/languages/rust.js
-rw-r--r-- 1 jgoebel staff 7337 Oct 14 18:01 src/languages/scss.js
-rw-r--r-- 1 jgoebel staff 5313 Oct 14 18:00 src/languages/swift.js
-rw-r--r-- 1 jgoebel staff 5673 Oct 14 18:03 src/languages/typescript.js
All sizes uncompressed... All seem pretty tight and compact...
+41kb raw +16kb (gzipped)
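The raw figure can be checked by adding up the byte sizes from the listing above. The gzipped figure cannot be reproduced without the files themselves, so this sketch covers only the uncompressed total.

```python
# Uncompressed sizes in bytes, copied from the ls -l listing above.
PROPOSED_SIZES = {
    "dockerfile": 732,
    "go": 1692,
    "kotlin": 6414,
    "less": 5022,
    "lua": 2965,
    "plaintext": 209,
    "r": 1873,
    "rust": 3523,
    "scss": 7337,
    "swift": 5313,
    "typescript": 5673,
}

total_bytes = sum(PROPOSED_SIZES.values())
print(total_bytes)                # 40753
print(round(total_bytes / 1000))  # 41 (kB, decimal)
```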
That's just off the top of my head.
I don't really know about R, but it's pretty small... Most of the other stuff is pretty known to be hot right now and kind of popular. Swift, Rust, Go, Kotlin, SCSS, Less, Docker, Typescript, etc...
The only thing that popped for possible demotion is CoffeeScript (4.1kb uncompressed), which has seen better days...
Oh, there is PowerShell, but it's pretty heavy at 35kb, which makes me like it less than the others.
If size was no issue (or less of an issue)... ie for a "medium" build I'd just go and start picking everything I've heard of:
And that'd probably still be pretty small.
% grep -R "Category:.*common" src/ | cut -d ':' -f1 | xargs ls -l
-rw-r--r-- 1 jgoebel staff 1550 Oct 14 17:56 src//languages/apache.js
-rw-r--r-- 1 jgoebel staff 2570 Oct 14 17:56 src//languages/bash.js
-rw-r--r-- 1 jgoebel staff 4160 Oct 14 17:56 src//languages/coffeescript.js
-rw-r--r-- 1 jgoebel staff 6896 Oct 14 17:56 src//languages/cpp.js
-rw-r--r-- 1 jgoebel staff 5688 Oct 14 17:56 src//languages/cs.js
-rw-r--r-- 1 jgoebel staff 2881 Oct 14 17:56 src//languages/css.js
-rw-r--r-- 1 jgoebel staff 1045 Oct 14 17:56 src//languages/diff.js
-rw-r--r-- 1 jgoebel staff 732 Oct 14 18:02 src//languages/dockerfile.js
-rw-r--r-- 1 jgoebel staff 1692 Oct 14 18:00 src//languages/go.js
-rw-r--r-- 1 jgoebel staff 1179 Oct 14 17:56 src//languages/http.js
-rw-r--r-- 1 jgoebel staff 1802 Oct 14 17:56 src//languages/ini.js
-rw-r--r-- 1 jgoebel staff 3431 Oct 14 17:56 src//languages/java.js
-rw-r--r-- 1 jgoebel staff 5923 Oct 14 17:56 src//languages/javascript.js
-rw-r--r-- 1 jgoebel staff 1327 Oct 14 17:56 src//languages/json.js
-rw-r--r-- 1 jgoebel staff 6414 Oct 14 18:02 src//languages/kotlin.js
-rw-r--r-- 1 jgoebel staff 5022 Oct 14 18:01 src//languages/less.js
-rw-r--r-- 1 jgoebel staff 2965 Oct 14 18:01 src//languages/lua.js
-rw-r--r-- 1 jgoebel staff 2156 Oct 14 17:56 src//languages/makefile.js
-rw-r--r-- 1 jgoebel staff 2530 Oct 14 17:56 src//languages/markdown.js
-rw-r--r-- 1 jgoebel staff 2480 Oct 14 17:56 src//languages/nginx.js
-rw-r--r-- 1 jgoebel staff 3513 Oct 14 17:56 src//languages/objectivec.js
-rw-r--r-- 1 jgoebel staff 5050 Oct 14 17:56 src//languages/perl.js
-rw-r--r-- 1 jgoebel staff 3724 Oct 14 17:56 src//languages/php.js
-rw-r--r-- 1 jgoebel staff 209 Oct 14 17:56 src//languages/plaintext.js
-rw-r--r-- 1 jgoebel staff 1850 Sep 24 00:04 src//languages/properties.js
-rw-r--r-- 1 jgoebel staff 3159 Oct 14 17:56 src//languages/python.js
-rw-r--r-- 1 jgoebel staff 1873 Oct 14 18:02 src//languages/r.js
-rw-r--r-- 1 jgoebel staff 5314 Oct 14 17:56 src//languages/ruby.js
-rw-r--r-- 1 jgoebel staff 3523 Oct 14 18:03 src//languages/rust.js
-rw-r--r-- 1 jgoebel staff 7337 Oct 14 18:01 src//languages/scss.js
-rw-r--r-- 1 jgoebel staff 363 Sep 24 00:04 src//languages/shell.js
-rw-r--r-- 1 jgoebel staff 14995 Oct 14 17:56 src//languages/sql.js
-rw-r--r-- 1 jgoebel staff 5313 Oct 14 18:00 src//languages/swift.js
-rw-r--r-- 1 jgoebel staff 5673 Oct 14 18:03 src//languages/typescript.js
-rw-r--r-- 1 jgoebel staff 3053 Oct 14 17:56 src//languages/xml.js
-rw-r--r-- 1 jgoebel staff 2664 Oct 14 17:56 src//languages/yaml.js
Apache and Nginx seem a little obscure perhaps (as languages), but the sizes are pretty small.
@egor-rogov Any issue with renaming a few? I was going to rename ini to toml (the superset) but then I worried about breaking links... but people link to a VERSION on a cdn... so if they want to upgrade they have to bump the version # anyways so maybe that's an opportunity for them to read the release notes and see we renamed something?
Not super important, just a thought.
@yyyc514 I wouldn't break compatibility with no good reason.
-rw-r--r-- 1 jgoebel staff 130977 Oct 14 19:05 highlight.medium.pack.js
-rw-r--r-- 1 jgoebel staff 71161 Oct 14 19:05 highlight.pack.js
And by modern standards both those look tiny (just packed, not even gzipped)
Closing this in favor of the new issue: https://github.com/highlightjs/highlight.js/issues/2206
Cc: @Sannis, @sourrust
Hi guys,
I've had a few ideas about updating the :common set of languages to better match reality and expectations.
Here are the download stats from the site, by language:
The 500+ top are the current :common set (with "dts" being there by mistake, we've missed it during review).
I've got two alternative ideas:
What do you think?