Open a-velasco opened 4 weeks ago
Issue 2 resolved: We had not deleted the /base-content tag when resetting the docs.
We made sure to delete it completely before running the action, and it worked as expected (imported all documentation pages).
RE: Issue 1
I think this might be due to the logic in the _calculate_contents_hierarchy function.
When checking if the current item has a lower whitespace count than expected, it seems to assume the item is just one level away, so it always corrects the hierarchy by subtracting 1.
In a case where it is two or more levels away, its hierarchy value will be incorrect, so the _check_contents_item function will complain that this item isn't part of the group that its hierarchy indicates (i.e. it raises the InputError I quoted in the OP).
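To make the failure mode above concrete, here is a toy model of a single-step correction. It is only a sketch and not the actual discourse-gatekeeper code: the function name, the `expected` mapping, and the 0-based levels are assumptions made for illustration.

```python
def single_step_levels(whitespace_counts):
    """Toy model: on a drop in indentation, step up exactly one level.

    Hypothetical helper, not part of discourse-gatekeeper.
    Expects non-negative whitespace counts, starting at 0.
    """
    expected = {0: 0}  # level -> whitespace count recorded for that level
    level = 0
    levels = []
    for ws in whitespace_counts:
        if ws < expected[level]:
            # Only ever goes up one level, even if the drop is larger
            level -= 1
        elif ws > expected[level]:
            level += 1
            expected[level] = ws
        levels.append(level)
    return levels


# dir1 / dir1/dir2 / dir1/dir2/file3.md / dir4 / dir4/file5.md
print(single_step_levels([0, 1, 2, 0, 1]))  # [0, 1, 2, 1, 1]
```

The fourth item has a whitespace count of 0 but is assigned level 1 instead of 0, which is the kind of mismatch that a later consistency check would reject.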
I've had some success by modifying _calculate_contents_hierarchy like so:
 while item:
     # All items in the current directory have been processed
     if item.whitespace_count < whitespace_expectation_per_level[hierarchy]:
+        if item.whitespace_count == 0:
+            hierarchy = 0
+            aggregate_dir = Path('.')
+        else:
+            hierarchy = hierarchy - 1
+            parent = parents.pop()
+            aggregate_dir = Path(parent.reference_value).parent
-        hierarchy = hierarchy - 1
-        parent = parents.pop()
-        aggregate_dir = Path(parent.reference_value).parent
This passes the unit tests in tests/unit/test_index_contents_hierarchy.py, and parses the contents as expected when I test it on the repository's CI.
I also added a new test case to simulate this scenario:
pytest.param(
    (
        item_1 := factories.IndexParsedListItemFactory(
            whitespace_count=0, reference_value=(value_1 := "dir1")
        ),
        item_2 := factories.IndexParsedListItemFactory(
            whitespace_count=1, reference_value=(value_2 := f"{value_1}/dir2")
        ),
        item_3 := factories.IndexParsedListItemFactory(
            whitespace_count=2, reference_value=(value_3 := f"{value_2}/file3.md")
        ),
        item_4 := factories.IndexParsedListItemFactory(
            whitespace_count=0, reference_value=(value_4 := "dir4")
        ),
        item_5 := factories.IndexParsedListItemFactory(
            whitespace_count=1, reference_value=(value_5 := f"{value_4}/file5.md")
        ),
    ),
    ("dir", "dir", "file", "dir", "file"),
    (
        factories.IndexContentsListItemFactory(
            hierarchy=1,
            reference_title=item_1.reference_title,
            reference_value=value_1,
            rank=item_1.rank,
        ),
        factories.IndexContentsListItemFactory(
            hierarchy=2,
            reference_title=item_2.reference_title,
            reference_value=value_2,
            rank=item_2.rank,
        ),
        factories.IndexContentsListItemFactory(
            hierarchy=3,
            reference_title=item_3.reference_title,
            reference_value=value_3,
            rank=item_3.rank,
        ),
        factories.IndexContentsListItemFactory(
            hierarchy=1,
            reference_title=item_4.reference_title,
            reference_value=value_4,
            rank=item_4.rank,
        ),
        factories.IndexContentsListItemFactory(
            hierarchy=2,
            reference_title=item_5.reference_title,
            reference_value=value_5,
            rank=item_5.rank,
        ),
    ),
    id="directory following a file nested twice",
)
@jdkandersson I would love to hear your thoughts about this and get some more insights into how the hierarchy calculation is designed! I haven't raised a PR because I'm not sure if this is the best approach, and I feel like there are probably more considerations to take into account when modifying this logic.
I'm taking a look now
I think reducing the level by 1 shouldn't be the problem: if the whitespace count is still lower than expected, it should be popped again on the next loop. I haven't had the time yet, but I would write a test with this case, potentially here: https://github.com/canonical/discourse-gatekeeper/blob/50d13cbb5123dd25f87896975044f2e3b3bdf5ca/tests/unit/test_index_contents_parse.py#L522, then see if it fails and check why.
actually, you already wrote the test, I'll use that as a starting point and see if I can fix the issue
Ok, also got the test failure, now need to check why
Part of the reason why I'm not sure we should use the proposed code is that it only takes care of the case where the whitespace count is 0. It should also be solved for cases where the hierarchy drops by 2 or more levels and the parent isn't at the root level.
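A toy model can illustrate this concern. The sketch below mimics the idea of the proposed patch (jump straight to the root when the whitespace count is 0, otherwise go up one level). It is not the real function: the name, the `expected` mapping, and the 0-based levels are assumptions for illustration only.

```python
def levels_with_root_special_case(whitespace_counts):
    """Toy model of the proposed patch: reset to root on 0, else go up one level.

    Hypothetical helper, not part of discourse-gatekeeper.
    Expects non-negative whitespace counts, starting at 0.
    """
    expected = {0: 0}  # level -> whitespace count recorded for that level
    level = 0
    levels = []
    for ws in whitespace_counts:
        if ws < expected[level]:
            # Special-case the root; every other drop still moves one level only
            level = 0 if ws == 0 else level - 1
        elif ws > expected[level]:
            level += 1
            expected[level] = ws
        levels.append(level)
    return levels


# Drop back to the root: handled by the special case
print(levels_with_root_special_case([0, 1, 2, 0, 1]))  # [0, 1, 2, 0, 1]
# Drop of two levels to a non-root parent: still off by one
print(levels_with_root_special_case([0, 1, 2, 3, 1]))  # [0, 1, 2, 3, 2]
```

In the second call the last item has a whitespace count of 1 but ends up at level 2 rather than 1, so the special case alone does not cover multi-level drops below the root.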
@jdkandersson Thanks a lot for taking a look!
I wrote that code to check whether this was the part of the logic that was failing for multiple level jumps, but if you agree that this is where the hierarchy mis-calculation is happening, then I definitely agree that it should be generalized for other cases.
I'll try to come up with a cleaner and more general solution, and submit a PR.
The solution is quite simple:
if item.whitespace_count < whitespace_expectation_per_level[hierarchy]:
    hierarchy = hierarchy - 1
    parent = parents.pop()
    aggregate_dir = Path(parent.reference_value).parent
    continue
Just needed to add the continue so that the if statement is evaluated again, rather than processing continuing after only going up one hierarchy level.
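The effect of re-evaluating the check can be sketched with a toy loop. This is a simplified model and not the actual discourse-gatekeeper implementation: the function name, the `expected` mapping, and the 0-based levels are assumptions for illustration.

```python
def levels_with_reevaluation(whitespace_counts):
    """Toy model: keep stepping up until the item fits the current level.

    Hypothetical helper, not part of discourse-gatekeeper.
    Expects non-negative whitespace counts, starting at 0.
    """
    expected = {0: 0}  # level -> whitespace count recorded for that level
    level = 0
    levels = []
    index = 0
    while index < len(whitespace_counts):
        ws = whitespace_counts[index]
        if ws < expected[level]:
            level -= 1
            # Re-check the same item against the parent level
            continue
        if ws > expected[level]:
            level += 1
            expected[level] = ws
        levels.append(level)
        index += 1
    return levels


# Drop back to the root after a twice-nested file
print(levels_with_reevaluation([0, 1, 2, 0, 1]))  # [0, 1, 2, 0, 1]
# Drop of two levels to a non-root parent
print(levels_with_reevaluation([0, 1, 2, 3, 1]))  # [0, 1, 2, 3, 1]
```

Because the same item is checked again after each step up, a drop of any number of levels is unwound one level at a time without needing a special case for the root.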
Would you like to raise a PR with this and the new test case you have written?
Thanks for finding, reporting and even creating a test case for this!
Ahh no way, I reached almost the same solution but thought the aggregate directory calculation had to be different. Thanks! I'll be happy to raise the PR with the solution + test case :)
Issues
index.md table of contents.
Running the action on a repo with no /docs folder creates a complete table of contents in index.md, but only imports some of those pages into the repo. Solved: see this comment.
Steps taken
The repository has an existing /docs folder with all the pages published to Charmhub. This migration was done by an old fork of the discourse-gatekeeper action.
Issue 1: Table of contents nesting
We updated our workflow from the fork to canonical/discourse-gatekeeper@stable. Then, we re-triggered the workflow with no changes to discourse or GitHub.
The action failed due to an index/ToC related error (link to logs):
It fails on this item of the contents table.
The same error happens with canonical/discourse-gatekeeper@main.
Issue 2 (solved): Action does not import all documentation pages
See this comment