gnu-octave / octave-doctest

Doctests for Octave/Matlab
https://gnu-octave.github.io/packages/doctest/
BSD 3-Clause "New" or "Revised" License
16 stars 4 forks source link

Test-abort error for Tablicious, maybe around classes in namespaces, or categorical.missing #283

Open apjanke opened 9 months ago

apjanke commented 9 months ago

Hi, doctest folks,

I'm working on the Tablicious package, and I'd like to make it "doctest clean", so it passes doctest, even if that doesn't do significant exercising of its code.

When running doctest inst against Tablicious HEAD of main a bit after v0.4.2, I get this "class not found: tblish.internal.dataset" error that seems to abort the whole doctest run.

Repro code:

# repro_doctest_tblish_error.m
# Reproduce apjanke's doctest errors for Tablicious

addpath ~/repos/octave-tablicious/inst
addpath ~/repos/octave-doctest/inst

# Not all the way in to inst; want to rely on regular octave path and not cwd.
cd ~/repos/octave-tablicious

doctest inst

I get:

>> version
ans = 8.4.0
>> pwd
ans = /Users/janke
>>
>> repro_doctest_tblish_error
Doctest v0.8.0+: this is Free Software without warranty, see source.

inst/
  +tblish/
    +examples/
      +internal/
        +datasets/
          AirPassengers_1.m ............................ NO TESTS
          BJsales_1.m .................................. NO TESTS
          BOD_1.m ...................................... NO TESTS
          ChickWeight_1.m .............................. NO TESTS
          DNase_1.m .................................... NO TESTS
          EuStockMarkets_1.m ........................... NO TESTS
[...]
  calendarDuration.vertcat ............................. NO TESTS
error: unknown package 'categorical'
error: called from
    <unknown>
    doctest_collect>extract_docstring at line 367 column 25
    doctest_collect>collect_targets_class at line 330 column 38
    doctest_collect at line 167 column 11
    doctest_collect at line 147 column 13
    doctest at line 356 column 11
    repro_doctest_tblish_error at line 13 column 1
>>
>> pwd
ans = /Users/janke/repos/octave-tablicious/inst
>>

Note that it ends up in a different directory than I started in, too. With a dbstop if error, stopped at that same place:

debug> name
name = categorical.missing
debug> dbup
stopped in doctest_collect>collect_targets_class at line 330 [/Users/janke/repos/octave-doctest/inst/private/doctest_collect.m]
debug> dbup
stopped in doctest_collect at line 167 [/Users/janke/repos/octave-doctest/inst/private/doctest_collect.m]
debug> w
w = @categorical
debug> depth
depth = 1
debug>

I'm thinking maybe this is a problem with support for methods in classes, classes in namespaces, some combination of both, or maybe I'm just writing my texinfo sections wrong?

I ran this against commit f0e346ab97eba11462d510305550bc371e797f4d in Tablicious (https://github.com/apjanke/octave-tablicious/commit/f0e346ab97eba11462d510305550bc371e797f4d).

I'm trying to make a minimal repro case in branch WIP/repro-categorical-missing-error in repo apjanke/octo-docto-repro, but haven't had luck yet.

References

Here's my own ticket in Tablicious about ~this: https://github.com/apjanke/octave-tablicious/issues/129

@mmuetzel originally reported something like this in https://github.com/gnu-octave/packages/pull/401. Though there it was failing on something different, and in some of my other tests, it is choking on categorical.missing (a static method in Tablicious's categorical class) instead of the tblish.internal.dataset error I'm getting here.

mmuetzel commented 9 months ago

~~Is this a typo in the docstring or code? Some folders appear to be named +datasets. But it is complaining about tblish.internal.dataset. Should it be tblish.internal.datasets (with the "s") instead?~~

Edit: Nevermind. I mixed up the +folder and the .m file of similar names.

apjanke commented 8 months ago

Yeah, there's all of tblish.dataset, tblish.datasets, tblish.internal.dataset, and tblish.internal.datasets. I should probably reduce that name duplication. Especially since Matlab has a dataset class too, and none of these are the same as that.

One thing the tblish.internal.dataset class and categorical.missing method have in common is that the last component of their names - "dataset" and "missing" - is the name of a class that core Octave doesn't supply, but has a missing-function hook for.

I think it's something to do with Octave's get_help_text behavior. In a Fresh octave session, when I don't have Tablicious loaded:

>> get_help_text ('categorical')
ans =
>> get_help_text ('categorical.missing')
ans =
>> get_help_text ('tblish.internal.dataset')
ans =
>> get_help_text ('tblish.internal.datasets')
ans =
>>

But with Tablicious loaded, get_help_text ('categorical.missing') raises an actual error, instead of returning empty or some text. Doesn't seem to affect tblish.internal.dataset, though.

>> addpath ~/repos/octave-tablicious/inst
>> get_help_text ('categorical.missing')
error: unknown package 'categorical'
error: called from
    <unknown>
>> get_help_text ('tblish.internal.dataset')
ans = DATASET An example dataset, managed by Tablicious’ datasets mechanism

 To use this, subclass it, and have your constructor populate
 the name and description fields, and implement load(), regenerate(),
 and possibly fetch().

 There are three typical ways to define a dataset:
   - All in code, in which case your subclass will just implement load()
   - As a stored, checked-in mat-file, in which case your subclass will
       implement regenerate_dataset() and load()
   - As a cached post-installation mat-file, in which case your subclass
       will implement cache_dataset() and load(), and make use of
       cache_file_path() in both of them.

>> get_help_text ('tblish.internal.datasets')
ans =
>>
>> get_help_text ('categorical')
ans =
 @deftp {Class} categorical

 Categorical variable array.

 A @code{categorical} array represents an array of values of a categorical
 variable. Each @code{categorical} array stores the element values along
[...]

On eilonwy, my macOS 14 M1 Mac (switched to that today), I loaded in octave-doctest (from its current HEAD of main, eb9429f) and did doctest inst from the octave-tablicious repo root dir. Now it's erroring on categorical.missing again.

>> dbstop if error
>> doctest inst
[...]

  calendarDuration.vertcat ............................. NO TESTS
error: unknown package 'categorical'
error: called from
    <unknown>
    doctest_collect>extract_docstring at line 367 column 25
    doctest_collect>collect_targets_class at line 330 column 38
    doctest_collect at line 167 column 11
    doctest_collect at line 147 column 13
    doctest at line 356 column 11
stopped in doctest_collect>extract_docstring at line 367 [/Users/janke/repos/octave-doctest/inst/private/doctest_collect.m]
367:     [docstring, format] = get_help_text(name);
debug>

Different stacktrace than for tblish.internal.dataset before (in my original post). Hmm. But the bottom stack frame is where it's calling get_help_text() on 'categorical.missing'.

Same new stacktrace if I run it using the published OF doctest package, with pkg install -forge doctest.

  calendarDuration.vertcat ............................. NO TESTS
error: unknown package 'categorical'
error: called from
    <unknown>
    doctest_collect>extract_docstring at line 367 column 25
    doctest_collect>collect_targets_class at line 330 column 38
    doctest_collect at line 167 column 11
    doctest_collect at line 147 column 13
    doctest at line 356 column 11
stopped in doctest_collect>extract_docstring at line 367 [/Users/janke/Library/Application Support/Octave.app/8.4.0/pkg/doctest-0.8.0/private/doctest_collect.m]
367:     [docstring, format] = get_help_text(name);
debug>

...aaaand I can reproduce the stacktrace from my original post here if I fire up a new Octave session and just forget to actually add Tablicious to the Octave path before running cd-ing to it and running doctest inst on it.

>> cd repos/octave-tablicious/
>> addpath ~/repos/octave-doctest/inst
>> dbstop if error
>> doctest inst
Doctest v0.8.0+: this is Free Software without warranty, see source.

inst/
  +tblish/
    +examples/
[...]
      +datasets/
        @AirPassengers/
error: class not found: tblish.internal.dataset
error: called from
    <unknown>
    doctest_collect>extract_docstring at line 367 column 25
    doctest_collect>collect_targets_function at line 271 column 36
    doctest_collect at line 163 column 10
    doctest_collect at line 147 column 13
    doctest at line 356 column 11
stopped in doctest_collect>extract_docstring at line 367 [/Users/janke/repos/octave-doctest/inst/private/doctest_collect.m]
367:     [docstring, format] = get_help_text(name);
debug> pwd
ans = /Users/janke/repos/octave-tablicious/inst/+tblish/+internal/+datasets/@AirPassengers
debug>

So that part at least is user error on my part. I'll update this ticket's description with a fixed repro.

Looks like here it's cd-ing around, it's operating on and not recognizing that AirPassengers is namespaced, and it's trying to grab the help text for that, not for the tblish.internal.dataset that shows up in the error mesage.

debug> name
name = AirPassengers.m
ebug> dbup
stopped in doctest_collect>collect_targets_function at line 271 [/Users/janke/repos/octave-doctest/inst/private/doctest_collect.m]
debug> dbup
stopped in doctest_collect at line 163 [/Users/janke/repos/octave-doctest/inst/private/doctest_collect.m]
debug> w
w = AirPassengers.m
debug> type
type = function
debug> pwd
ans = /Users/janke/repos/octave-tablicious/inst/+tblish/+internal/+datasets/@AirPassengers
debug>

I think at that point, because of the cd, the AirPassengers.m file is on the path because . is on the path, but there it appears as a non-namespaced class, ignoring the +tblish/+internal/+datasets above it. That affects whether get_help_text raises an error here.

debug> cd ..
debug> cd ..
debug> cd ..
debug> pwd
ans = /Users/janke/repos/octave-tablicious/inst/+tblish
debug> get_help_text ('AirPassengers.m')
ans =
debug> cd +internal
debug> cd +datasets/@AirPassengers
debug> pwd
ans = /Users/janke/repos/octave-tablicious/inst/+tblish/+internal/+datasets/@AirPassengers
debug> get_help_text ('AirPassengers.m')
error: class not found: tblish.internal.dataset
error: called from
    <unknown>
    doctest_collect>extract_docstring at line 1 column 7
    doctest_collect>collect_targets_function at line 271 column 36
    doctest_collect at line 163 column 10
    doctest_collect at line 147 column 13
    doctest at line 356 column 11
debug>

Now, tblish.internal.datasets.AirPassengers and all the other classes in the tblish.internal.datasets inherit from tblish.internal.dataset. In AirPassengers.m:

classdef AirPassengers < tblish.internal.dataset
  # Monthly Airline Passenger Numbers 1949-1960

So I think what's happening here (in my user-error case, not overall) is get_help_text('AirPassengers.m') is causing Octave to load that m-file and try to parse/compile the class, and that's failing because the inst dir of Tablicious is not on the path and is not the cwd, so the definition of tblish.internal.dataset is not visible to Octave, and loading the class (inside get_help_text) fails, and it errors out.

Still would like to figure out the "regular" error case when Tablicious is properly loaded.


My ticket for handling the "dataset" naming ambiguity:

apjanke commented 8 months ago

A bit of diagnosis, stopped at that "error: unknown package 'categorical'" error inside doctest inst. From there, called get_help_text on various names.

debug> dbstack
stopped in:

  -->     doctest_collect>extract_docstring at line 367 [/Users/janke/repos/octave-doctest/inst/private/doctest_collect.m]
      doctest_collect>collect_targets_class at line 330 [/Users/janke/repos/octave-doctest/inst/private/doctest_collect.m]
                            doctest_collect at line 167 [/Users/janke/repos/octave-doctest/inst/private/doctest_collect.m]
                            doctest_collect at line 147 [/Users/janke/repos/octave-doctest/inst/private/doctest_collect.m]
                                    doctest at line 356 [/Users/janke/repos/octave-doctest/inst/doctest.m]
                 repro_doctest_tblish_error at line 13 [/Users/janke/Documents/octave-stuff/repro-doctest/repro_doctest_tblish_error.m]
debug> get_help_text ('categorical.missing')
error: unknown package 'categorical'
error: called from
    <unknown>
    doctest_collect>extract_docstring at line 367 column 25
[...]

debug> get_help_text ('categorical')
ans =
 @deftp {Class} categorical

 Categorical variable array.

 A @code{categorical} array represents an array of values of a categorical
 variable. Each @code{categorical} array stores the element values along
[...]

debug> get_help_text ('categorical.circshift')
ans =
 @deftypefn  {} {@var{y} =} circshift (@var{x}, @var{n})
 @deftypefnx {} {@var{y} =} circshift (@var{x}, @var{n}, @var{dim})
 Circularly shift the values of the array @var{x}.

 @var{n} must be a vector of integers no longer than the number of dimensions
 in @var{x}.  The values of @var{n} can be either positive or negative, which
[...]

debug> get_help_text ('categorical.asgn')
ans = ASGN Assign array elements by indexes.
 This is what you call internally inside the class instead of doing
[...]

debug> get_help_text ('categorical.codes_with_nans')
ans =
debug> get_help_text ('categorical.subset')
ans = SUBSET Subset array by indexes.
 This is what you call internally inside the class instead of doing
 ()-indexing references on the RHS, which don't work properly inside the class
 because they don't respect the subsref() override.

debug> get_help_text ('categorical.parensRef')
ans = PARENSREF ()-indexing, for this class's internal use

debug>
[...]

debug> get_help_text ('categorical.sombrero')
ans =
 @deftypefn  {} {} sombrero ()
 @deftypefnx {} {} sombrero (@var{n})
 @deftypefnx {} {@var{z} =} sombrero (@dots{})
 @deftypefnx {} {[@var{x}, @var{y}, @var{z}] =} sombrero (@dots{})
 Plot the familiar 3-D sombrero function.

 The function plotted is

Kinda odd.

I get the same behavior when I fire up a fresh session and do the same get_help_text calls from the base workspace outside the debugger, with only Tablicious loaded and not doctest.

I tried adding a texinfo block to my categorical.circshift.

    ## -*- texinfo -*-
    ## @node categorical.circshift
    ## @deftypefn {Method} {@var{out} =} circshift (@code{this})
    ##
    ## Shift positions of elements circularly, for categoricals.
    ##
    ## @end deftypefn
    function this = circshift (this, varargin)
      #CIRCSHIFT Shift positions of elements circularly.

Same behavior; it returned the helptext for the core circshift.

>> get_help_text ('categorical.circshift')
ans =
 @deftypefn  {} {@var{y} =} circshift (@var{x}, @var{n})
 @deftypefnx {} {@var{y} =} circshift (@var{x}, @var{n}, @var{dim})
 Circularly shift the values of the array @var{x}.

 @var{n} must be a vector of integers no longer than the number of dimensions
 in @var{x}.  The values of @var{n} can be either positive or negative, which

To see if it was specific to the name, I added a duplicate circshift method, with just the name changed and a "dupe" note added to its helptext.

    ## -*- texinfo -*-
    ## @node categorical.circshift2
    ## @deftypefn {Method} {@var{out} =} circshift2 (@code{this})
    ##
    ## Shift positions of elements circularly, for categoricals (DUPE for testing).
    ##
    ## @end deftypefn
    function this = circshift2 (this, varargin)
      #CIRCSHIFT Shift positions of elements circularly (DUPE for testing).
      this.code = circshift (this.code, varargin{:});
      this.tfMissing = circshift (this.tfMissing, varargin{:});
    endfunction

For that "circshift2" one, I got my helptext.

> get_help_text ('categorical.circshift2')
ans =
 @node categorical.circshift2
 @deftypefn {Method} {@var{out} =} circshift2 (@code{this})

 Shift positions of elements circularly, for categoricals (DUPE for testing).

Speculation

Looks to me like core Octave's get_help_text isn't properly handling methods on classes when they have the same name as a built-in or global function, or something like that. And doctest is being affected by that due to its dependency on get_help_text.

Might be hard to do a full fix. Maybe doctest could throw a try/catch around the get_help_text calls and if they raise an error, turn that in to a failure result so the whole test still runs instead of just aborting? Then live with the incorrect helptext extraction until Octave supports classes and namespaces better there?

cbm755 commented 8 months ago

Sorry this is a bit TL-DR... apologies for skimming.