highlightjs / highlight.js

JavaScript syntax highlighter with language auto-detection and zero dependencies.
https://highlightjs.org/
BSD 3-Clause "New" or "Revised" License

Stan syntax updates #3410

Closed spinkney closed 2 years ago

spinkney commented 2 years ago

Information

Description: Lots of updates to Stan over the last few months.

Type updates:

The higher order functions that are valid are:

      ## Algebraic equation solver
      "algebra_solver", "algebra_solver_newton",

      ## Ordinary differential equation
      "ode_rk45", "ode_rk45_tol", "ode_ckrk", "ode_ckrk_tol", "ode_adams",
      "ode_adams_tol", "ode_bdf", "ode_bdf_tol", "ode_adjoint_tol_ctl",

      ## 1D integrator
      "integrate_1d",

      ## Reduce-sum function
      "reduce_sum", "reduce_sum_static",

It's probably easier to just have all the functions. You can view the recent pull request in the Rouge library.

Long list of functions

      # Integer-Valued Basic Functions
      ## Absolute functions
      "abs", "int_step",
      ## Bound functions
      "min", "max",
      ## Size functions
      "size",

      # Real-Valued Basic Functions
      ## Log probability function
      "target", "get_lp",
      ## Logical functions
      "step", "is_inf", "is_nan",
      ## Step-like functions
      "fabs", "fdim", "fmin", "fmax", "fmod", "floor", "ceil", "round", "trunc",
      ## Power and logarithm functions
      "sqrt", "cbrt", "square", "exp", "exp2", "log", "log2", "log10", "pow",
      "inv", "inv_sqrt", "inv_square",
      ## Trigonometric functions
      "hypot", "cos", "sin", "tan", "acos", "asin", "atan", "atan2",
      ## Hyperbolic trigonometric functions
      "cosh", "sinh", "tanh", "acosh", "asinh", "atanh",
      ## Link functions
      "logit", "inv_logit", "inv_cloglog",
      ## Probability-related functions
      "erf", "erfc", "Phi", "inv_Phi", "Phi_approx", "binary_log_loss", "owens_t",
      ## Combinatorial functions
      "beta", "inc_beta", "lbeta", "tgamma", "lgamma", "digamma", "trigamma",
      "lmgamma", "gamma_p", "gamma_q", "binomial_coefficient_log", "choose",
      "bessel_first_kind", "bessel_second_kind", "modified_bessel_first_kind",
      "log_modified_bessel_first_kind", "modified_bessel_second_kind",
      "falling_factorial", "lchoose", "log_falling_factorial",
      "rising_factorial", "log_rising_factorial",
      ## Composed functions
      "expm1", "fma", "multiply_log", "ldexp", "lmultiply", "log1p", "log1m",
      "log1p_exp", "log1m_exp", "log_diff_exp", "log_mix", "log_sum_exp",
      "log_inv_logit", "log_inv_logit_diff", "log1m_inv_logit",
      ## Special functions
      "lambert_w0", "lambert_wm1",
      ## Complex Conversion Functions
      "get_real", "get_imag",

      # Complex-Valued Basic Functions
      ## Complex Construction Functions
      "to_complex",

      # Array Operations
      ## Reductions
      "sum", "prod", "log_sum_exp", "mean", "variance", "sd", "distance",
      "squared_distance", "quantile",
      ## Array size and dimension function
      "dims", "num_elements",
      ## Array broadcasting
      "rep_array",
      ## Array concatenation
      "append_array",
      ## Sorting functions
      "sort_asc", "sort_desc", "sort_indices_asc", "sort_indices_desc", "rank",
      ## Reversing functions
      "reverse",

      # Matrix Operations
      ## Integer-valued matrix size functions
      "num_elements", "rows", "cols",
      ## Dot products and specialized products
      "dot_product", "columns_dot_product", "rows_dot_product", "dot_self",
      "columns_dot_self", "rows_dot_self", "tcrossprod", "crossprod",
      "quad_form", "quad_form_diag", "quad_form_sym", "trace_quad_form",
      "trace_gen_quad_form", "multiply_lower_tri_self_transpose",
      "diag_pre_multiply", "diag_post_multiply",
      ## Broadcast functions
      "rep_vector", "rep_row_vector", "rep_matrix", "symmetrize_from_lower_tri",
      ## Diagonal matrix functions
      "add_diag", "diagonal", "diag_matrix", "identity_matrix",
      ## Container construction functions
      "linspaced_array", "linspaced_int_array", "linspaced_vector",
      "linspaced_row_vector", "one_hot_int_array", "one_hot_array",
      "one_hot_vector", "one_hot_row_vector", "ones_int_array", "ones_array",
      "ones_vector", "ones_row_vector", "zeros_int_array", "zeros_array",
      "zeros_vector", "zeros_row_vector", "uniform_simplex",
      ## Slicing and blocking functions
      "col", "row", "block", "sub_col", "sub_row", "head", "tail", "segment",
      ## Matrix concatenation
      "append_col", "append_row",
      ## Special matrix functions
      "softmax", "log_softmax", "cumulative_sum",
      ## Covariance functions
      "cov_exp_quad",
      ## Linear algebra functions and solvers
      "mdivide_left_tri_low", "mdivide_right_tri_low", "mdivide_left_spd",
      "mdivide_right_spd", "matrix_exp", "matrix_exp_multiply",
      "scale_matrix_exp_multiply", "matrix_power", "trace", "determinant",
      "log_determinant", "inverse", "inverse_spd", "chol2inv",
      "generalized_inverse", "eigenvalues_sym", "eigenvectors_sym",
      "qr_thin_Q", "qr_thin_R", "qr_Q", "qr_R", "cholesky_decompose",
      "singular_values", "svd_U", "svd_V",

      # Sparse Matrix Operations
      ## Conversion functions
      "csr_extract_w", "csr_extract_v", "csr_extract_u", "csr_to_dense_matrix",
      ## Sparse matrix arithmetic
      "csr_matrix_times_vector",

      # Mixed Operations
      "to_matrix", "to_vector", "to_row_vector", "to_array_2d", "to_array_1d",

      # Higher-Order Functions
      ## Algebraic equation solver
      "algebra_solver", "algebra_solver_newton",
      ## Ordinary differential equation
      "ode_rk45", "ode_rk45_tol", "ode_ckrk", "ode_ckrk_tol", "ode_adams",
      "ode_adams_tol", "ode_bdf", "ode_bdf_tol", "ode_adjoint_tol_ctl",
      ## 1D integrator
      "integrate_1d",
      ## Reduce-sum function
      "reduce_sum", "reduce_sum_static",
      ## Map-rect function
      "map_rect",

      # Deprecated Functions
      "integrate_ode_rk45", "integrate_ode", "integrate_ode_adams",
      "integrate_ode_bdf",

      # Hidden Markov Models
      "hmm_marginal", "hmm_latent_rng", "hmm_hidden_state_prob"

The BNF grammars page was updated. I see that it is referenced in the Prism Stan code. The updated file is Stan BNF grammars 2.28.

spinkney commented 2 years ago

Also, we need to make sure that complex numbers are highlighted correctly. VS Code shows this:

[Screenshot: Screen Shot 2021-11-28 at 9 03 18 AM]

issues:

joshgoebel commented 2 years ago

Sounds about right at a glance. You willing to work on a PR for this, @spinkney ?

spinkney commented 2 years ago

@joshgoebel I can take a stab, it doesn't look too hard.

spinkney commented 2 years ago

I have a fork with the updates: https://github.com/spinkney/highlight.js/blob/main/src/languages/stan.js. The build runs. However, when I try to test with developer.html, Stan is not in the language drop-down list.

joshgoebel commented 2 years ago

You need to build it:

node ./tools/build.js -t browser :common stan

joshgoebel commented 2 years ago

We also now have title.function and title.class... mostly-deprecating plain old title... and perhaps some of those built-ins should be title.function? (we don't have super clear guidance on built-in vs title.function...)

Thoughts?

spinkney commented 2 years ago

Thanks. Do you have any pointers on how to add suffixes that are valid for a list of words?

So I have a bunch of distributions and they can have different suffixes. I don't want to have 5 normal distributions where each is normal_a, normal_b, etc., like it was in the previous file. I want to specify normal and then the valid endings.

joshgoebel commented 2 years ago

I don't want to have 5 normal distributions

Actually we often prefer it this way (for maintenance and readability) - if the number of variations is smaller - and that would definitely be the case with 5.

If there are a lot more than 5 then you'd need to use a custom mode and use regex to do the matching.
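
As a rough sketch of the regex route mentioned above, a single pattern can be assembled from a list of names and a list of suffixes. The names `DISTRIBUTION_NAMES`, `SUFFIXES`, `DISTRIBUTION_RE`, and `DISTRIBUTION_MODE` are hypothetical, and the lists are illustrative rather than complete:

```javascript
// Hypothetical lists for illustration; not the full set Stan defines.
const DISTRIBUTION_NAMES = ["normal", "cauchy"];
const SUFFIXES = ["cdf", "lccdf", "lcdf", "lpdf", "rng"];

// One pattern that matches any name followed by "_" and any listed suffix.
const DISTRIBUTION_RE = new RegExp(
  "\\b(?:" + DISTRIBUTION_NAMES.join("|") + ")_(?:" + SUFFIXES.join("|") + ")\\b"
);

// A plain mode object of the kind a language definition lists under `contains`.
const DISTRIBUTION_MODE = {
  className: "built_in",
  begin: DISTRIBUTION_RE,
};

console.log(DISTRIBUTION_RE.test("normal_lpdf")); // true
console.log(DISTRIBUTION_RE.test("normal_foo"));  // false
```

The trade-off is the one described in the thread: the pattern is compact, but it is a mode rather than a keyword list, so it will not necessarily behave exactly like keywords.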

joshgoebel commented 2 years ago

For example:

    'normal_cdf',
    'normal_lccdf',
    'normal_lcdf',
    'normal_lpdf',
    'normal_rng',

Breaking this out is more trouble than it's worth because keywords and modes don't behave exactly the same so it's better to just have the longer list of keywords.... which compression will smash and keyword look-up is O(1-ish), so there isn't a lot of harm done.

spinkney commented 2 years ago

ok but then I'll have to do it for every distribution and when a new _bar is added

spinkney commented 2 years ago

there's also _lupdf and soon there will be _qf

joshgoebel commented 2 years ago

You can build portions of the keyword list programmatically if it will save a lot of time/thinking:

const DISTRIBUTIONS = ["normal", "abnormal" /* , ... */];
// Expand each distribution name into its suffixed variants.
const expanded = DISTRIBUTIONS.flatMap(name => [
  `${name}_lupdf`,
  `${name}_qf`,
]);

Very little magic, no regex, and it's still a simple array when finished.
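
To address the concern about new suffixes, the same idea can be extended so that both the names and the suffixes live in their own lists. This is only a sketch: `NAMES`, `SUFFIXES`, and `EXPANDED` are hypothetical names, and the lists are deliberately short:

```javascript
// Illustrative inputs; the real lists would be much longer.
const NAMES = ["normal", "cauchy"];
const SUFFIXES = ["cdf", "lccdf", "lcdf", "lpdf", "lupdf", "rng"];

// Cross every name with every suffix to get one flat array of keywords.
const EXPANDED = NAMES.flatMap(name =>
  SUFFIXES.map(suffix => `${name}_${suffix}`)
);

console.log(EXPANDED.length); // 12
console.log(EXPANDED[0]);     // "normal_cdf"
```

With this layout, adding a new suffix is a one-line change to `SUFFIXES`, and the result is still a simple array.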

spinkney commented 2 years ago

Do I just reference expanded? I'm not getting the list built out from this and I assume it's something silly.

vscode is also complaining about flatMap

Property 'flatMap' does not exist on type 'string[]'. Do you need to change your target library? Try changing the 'lib' compiler option to 'es2019' or later.ts(2550)

joshgoebel commented 2 years ago

Dunno, it's newish... https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/flatMap

I'd suggest doing as it says: Try changing the 'lib' compiler option to 'es2019'
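
If changing the `lib` option is not practical, the same flattening can be written without `flatMap`. A minimal sketch using only ES2015 features, with hypothetical names (`NAMES`, `expandedCompat`) and an illustrative pair of suffixes:

```javascript
// Illustrative list of distribution names.
const NAMES = ["normal", "cauchy"];

// map() yields one small array per name; concat() flattens them one level.
const expandedCompat = [].concat(
  ...NAMES.map(name => [`${name}_lupdf`, `${name}_rng`])
);

console.log(expandedCompat);
// [ 'normal_lupdf', 'normal_rng', 'cauchy_lupdf', 'cauchy_rng' ]
```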

joshgoebel commented 2 years ago

Do I just reference expanded?

Well, after expanding you're probably going to add it back into the larger list with spread syntax, etc...
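
Merging the generated entries back into a larger list can be done with array spread syntax (`...`). A small sketch, where `BUILT_INS` and the hand-written entries are hypothetical and the lists are shortened for illustration:

```javascript
const DISTRIBUTIONS = ["normal", "cauchy"];
const DISTRIBUTIONS_EXPANDED = DISTRIBUTIONS.flatMap(name => [
  `${name}_lpdf`,
  `${name}_rng`,
]);

// Hand-written entries plus the generated ones, merged into one flat array.
const BUILT_INS = [
  "abs",
  "sqrt",
  ...DISTRIBUTIONS_EXPANDED,
];

console.log(BUILT_INS.length); // 6
```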

spinkney commented 2 years ago

Great, things are looking good. I'm just having trouble getting a different color for control flow keywords (for|in|if|else|while|break|continue|return).

I've tried putting it in keywords and having it broken out in the contains section. Right now the keywords in the flow class are commented out. If I uncomment them, I get the same highlighting as VAR_TYPES.
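
One possible approach, offered as a sketch rather than the fix that ended up in the PR, is to pass `keywords` as an object so that each group of words gets its own scope. `CONTROL_FLOW` and `KEYWORDS` are hypothetical names, and the `VAR_TYPES` entries here are a short illustrative subset:

```javascript
// Control-flow words taken from the comment above.
const CONTROL_FLOW = ["for", "in", "if", "else", "while", "break", "continue", "return"];
// A few Stan types, for illustration only.
const VAR_TYPES = ["int", "real", "vector", "matrix"];

// Each key names the scope applied to the words listed under it.
const KEYWORDS = {
  keyword: CONTROL_FLOW,
  type: VAR_TYPES,
};

console.log(Object.keys(KEYWORDS)); // [ 'keyword', 'type' ]
```

Whether the two groups actually render in different colors depends on the theme, since the theme decides how each scope is styled.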

spinkney commented 2 years ago

PR is in and ready for review

spinkney commented 2 years ago

@joshgoebel this almost gets me the UDFs and the language-defined distributions, but the UDF distributions have a different color than the distributions in keywords. I want both of them to be the same color as the built-in class. Is there a way to achieve this?

    {
      className: 'built_in',
      begin: '\\s*(' + hljs.IDENT_RE + ')(?=\\()',
      keywords: DISTRIBUTIONS.concat(DISTRIBUTIONS_EXPANDED)
    }

Edit: I got it with:

    {
      className: 'built_in',
      begin: '\\s*(' + hljs.IDENT_RE + ')(?=\\()',
      built_in: DISTRIBUTIONS.concat(DISTRIBUTIONS_EXPANDED)
    }
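
The `begin` pattern in both snippets relies on a lookahead, `(?=\\()`, so an identifier is matched only when an opening parenthesis follows it, and the parenthesis itself is not consumed. A small standalone check of that behavior, assuming the local `IDENT_RE` below mirrors the value of `hljs.IDENT_RE` (`CALL_RE` and the sample strings are hypothetical):

```javascript
// Assumed to mirror hljs.IDENT_RE; defined locally so the snippet runs alone.
const IDENT_RE = "[a-zA-Z]\\w*";
const CALL_RE = new RegExp("\\s*(" + IDENT_RE + ")(?=\\()");

// An identifier directly followed by "(" is captured without the parenthesis.
const match = CALL_RE.exec("target += my_dist_lpdf(y | mu);");
console.log(match[1]); // "my_dist_lpdf"

// No "(" after any identifier, so nothing matches.
console.log(CALL_RE.test("real sigma;")); // false
```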