con / opfvta-reexecution

Container-based Replication of https://doi.org/10.1038/s41398-022-01812-5
Apache License 2.0

`ValueError` when running on discovery #29

Closed TheChymera closed 11 months ago

TheChymera commented 1 year ago

As reported by Austin, we ran into another error on discovery (stderr, stdout).

The crux of it seems to be that the ANOVA is failing:

----  Messages for py:session9:default  ----
  Traceback (most recent call last):
* PythonTeX stderr - error on line 17 in "results.tex":
    File "<outputdir>/py_session9_default.py", line 429, in <module>
      print(pytex.formatter(boilerplate.anova(factor='C(Q("PA rel. Bregma [mm]"))')))
    File "/opt/src/opfvta/lib/boilerplate.py", line 143, in anova
      summary = sm.stats.anova_lm(ols, typ=typ, robust='hc3')
    File "/usr/lib/python3.10/site-packages/statsmodels/stats/anova.py", line 349, in anova_lm
      return anova_single(model, **kwargs)
    File "/usr/lib/python3.10/site-packages/statsmodels/stats/anova.py", line 83, in anova_single
      return anova3_lm_single(model, design_info, n_rows, test, pr_test,
    File "/usr/lib/python3.10/site-packages/statsmodels/stats/anova.py", line 252, in anova3_lm_single
      f = model.f_test(L12, cov_p=cov)
    File "/usr/lib/python3.10/site-packages/statsmodels/base/model.py", line 1765, in f_test
      res = self.wald_test(r_matrix, cov_p=cov_p, invcov=invcov, use_f=True, scalar=True)
    File "/usr/lib/python3.10/site-packages/statsmodels/base/model.py", line 1841, in wald_test
      LC = DesignInfo(names).linear_constraint(r_matrix)
    File "/usr/lib/python3.10/site-packages/patsy/design_info.py", line 536, in linear_constraint
      return linear_constraint(constraint_likes, self.column_names)
    File "/usr/lib/python3.10/site-packages/patsy/constraint.py", line 419, in linear_constraint
      return LinearConstraint(variable_names, coefs)
    File "/usr/lib/python3.10/site-packages/patsy/constraint.py", line 62, in __init__
      raise ValueError("must have at least one row in constraint matrix")
  ValueError: must have at least one row in constraint matrix

It's reported via TeX, but in the end this is a Python error, caused by either:

  1. Missing data on account of silent preprocessing errors.
  2. Updates in the statsmodels/patsy packages.

Since we pin the versions, I'm pretty sure it's (1), but I'll need to meet with @asmacdo again and check.
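
A quick way to probe cause (1) is to check how many distinct levels the ANOVA factor still has in the data that reached the stats step. The sketch below rests on an assumption this thread does not confirm: that dropped data left the factor `PA rel. Bregma [mm]` with fewer than two levels, which would leave nothing to test for that term. The column name comes from the traceback. The helper names and the sample rows are hypothetical, and which summary file carries this column is not stated here, so the functions take CSV text directly.

```python
import csv
import io

# Column name as it appears in the traceback's ANOVA formula.
FACTOR = "PA rel. Bregma [mm]"


def factor_levels(csv_text, column=FACTOR):
    """Return the sorted distinct non-empty values found in `column`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = ((row.get(column) or "").strip() for row in reader)
    return sorted({value for value in values if value})


def check_factor(csv_text, column=FACTOR):
    """Raise early, with a readable message, if the factor cannot be tested."""
    levels = factor_levels(csv_text, column)
    if len(levels) < 2:
        raise ValueError(
            f"{column!r} has {len(levels)} level(s); "
            "a categorical factor needs at least two"
        )
    return levels


if __name__ == "__main__":
    # Hypothetical rows, for illustration only.
    sample = "subject,PA rel. Bregma [mm]\na,-3.05\nb,-3.5\n"
    print(check_factor(sample))
```

Running a check like this before `boilerplate.anova` would turn a deep patsy traceback into a message that points at the data, if the assumption holds.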

TheChymera commented 1 year ago

@asmacdo could you check the lengths of our stats summary files on discovery? For reference, these are the numbers from my local run:

[dark]~/docsrc/opfvta ❱ for i in data/*csv; do wc -l ${i}; done
18 data/features_structural.csv
79 data/functional_significance.csv
79 data/functional_t.csv
34 data/groups.csv
10 data/implant_coordinates_block.csv
10 data/implant_coordinates.csv
10 data/implant_coordinates_phasic.csv

That would give us a good indication of whether data was dropped on discovery. If your numbers are larger, that just means my last local run dropped some data.
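
The comparison can also be scripted so it does not depend on reading two `wc -l` listings side by side. This is a minimal sketch: the reference counts are copied from the local run quoted above, while the function names and the default `data` directory are assumptions made for illustration.

```python
from pathlib import Path

# Line counts from the local run quoted above (output of `wc -l`).
REFERENCE_COUNTS = {
    "features_structural.csv": 18,
    "functional_significance.csv": 79,
    "functional_t.csv": 79,
    "groups.csv": 34,
    "implant_coordinates_block.csv": 10,
    "implant_coordinates.csv": 10,
    "implant_coordinates_phasic.csv": 10,
}


def count_lines(path):
    """Count newline characters in a file, which is what `wc -l` reports."""
    with open(path, "rb") as handle:
        return sum(chunk.count(b"\n") for chunk in iter(lambda: handle.read(65536), b""))


def compare_counts(data_dir, reference=REFERENCE_COUNTS):
    """Return {filename: (reference, found)} for files whose count differs.

    `found` is None when the file does not exist in `data_dir`.
    """
    mismatches = {}
    for name, expected in reference.items():
        path = Path(data_dir) / name
        found = count_lines(path) if path.is_file() else None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in sorted(compare_counts("data").items()):
        print(f"{name}: local run {expected}, here {found}")
```

An empty result means both runs produced summary files of the same length. A smaller count on discovery would point at dropped data there, and a larger one at data dropped in the local run.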

asmacdo commented 11 months ago

This was fixed way back