The function hltest() appears broken

The function hltest() does not seem to work properly: the p-values it returns are implausibly small. Take, for instance, the following setup, in which the response y and the predictor x are independent:

y = sample(0:1, 1000, replace=TRUE)
x = rnorm(1000)
m = glm(y~x, family='binomial')

The call hltest(m) produces the following (up to randomness):

The Hosmer-Lemeshow goodness-of-fit test

 Group Size Observed Expected
     1  100       48    4.711
     2  100       47    4.739
     3  100       45    4.752
     4  100       50    4.764
     5  100       50    4.775
     6  100       45    4.786
     7  100       44    4.796
     8  100       49    4.807
     9  100       49    4.822
    10  100       51    4.849

         Statistic =  4077.07 
degrees of freedom =  8 
           p-value =  < 2.22e-16

Compare this with performance::performance_hosmer(m), which produces

# Hosmer-Lemeshow Goodness-of-Fit Test

  Chi-squared: 1.986
           df: 8    
      p-value: 0.981

and vcdExtra::HLtest(m), which yields

Hosmer and Lemeshow Goodness-of-Fit Test 

Call:
glm(formula = y ~ x, family = "binomial")
 ChiSquare df   P_value
  1.985651  8 0.9814486

The latter two outputs agree with each other and are much more sensible: since y is independent of x, the fitted model is adequate and the test should not reject.
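As a cross-check, the statistic can also be computed by hand. The sketch below is only an illustration: hl_by_hand is a hypothetical helper name (not part of glmtoolbox, performance, or vcdExtra), and it assumes the usual grouping into 10 equal-size bins ordered by fitted probability, which may differ in detail from how each package forms its groups. Under that assumption the expected count in each group is the sum of the fitted probabilities, so it should be close to the observed count, whereas the Expected column printed by hltest() above is roughly a tenth of Observed.

```r
# Hand-rolled Hosmer-Lemeshow statistic (illustrative helper, not from any package)
hl_by_hand <- function(model, g = 10) {
  y <- model$y              # observed 0/1 responses stored by glm()
  p <- fitted(model)        # fitted probabilities
  n_obs <- length(p)

  # Assign observations to g (nearly) equal-size groups by rank of fitted probability
  grp <- ceiling(rank(p, ties.method = "first") * g / n_obs)

  size     <- as.vector(tapply(y, grp, length))  # group sizes
  observed <- as.vector(tapply(y, grp, sum))     # observed successes per group
  expected <- as.vector(tapply(p, grp, sum))     # expected successes per group

  # Sum over groups of (O - E)^2 / (E * (1 - E / n))
  stat <- sum((observed - expected)^2 / (expected * (1 - expected / size)))
  df <- g - 2

  list(table = data.frame(size, observed, expected),
       statistic = stat,
       df = df,
       p.value = pchisq(stat, df, lower.tail = FALSE))
}
```

Calling hl_by_hand(m) on the model above returns the grouped table together with the statistic, its degrees of freedom, and the p-value, which can then be compared with the three outputs shown here.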

Are these not the same Hosmer and Lemeshow tests?

cran / glmtoolbox