Closed edublancas closed 2 years ago
On the second bullet. @edublancas who defines what's a few? Also what's invalid in it that we need an error? Imo the user and maybe it's a valid notebook for them? I assume you meant H2s since we don't support H1s currently. Unless it's a 300 lines notebook and there's no or maybe 1-2 headers and then it make sense to output something to the user, but then it's pretty similar to https://github.com/ploomber/soorgeon/issues/47
I'd say:
I updated the original description
I am working on implementing the following logic. Just to confirm if my understanding is correct.
Only H1 headings are found. At this time, only H2 heading are supported.
Expected notebook to have at least one markdown H2 heading.
refactoring successful but only one H2 heading detected...
I will create _get_h1_header
method, similar to _get_h2_header
. But I think I will need to change the regex in _get_h2_header
to ^\s*##\s+(.+)
(add ^
). Otherwise \s*##\s+(.+)
will match ###
.
And the regex in the new method _get_h1_header
will be ^\s*#\s+(.+)
.
I will start to write tests for test_get_h2_header
and test_get_h1_header
.
Sounds about right, Only one H2 header will be dealt in a different PR. The function sounds good, just validate the regex actually matches to H1s (I tried here on python and seems it doesn't https://regex101.com/).
What's the input that fails the match? I tried the following cases and they all passed
@pytest.mark.parametrize('md, expected', [
['# Header', 'header'],
['# H1\n## H2', 'h1'],
[' \t # H1', 'h1'],
[' ## H2', None],
['something', None],
['# Something\nignore me', 'something'],
])
def test_get_h1_header(md, expected):
assert split._get_h1_header(md) == expected
^\s*##\s+(.+)
Is it working with this pattern? Also I tried it online and not locally with a py console, that might be failing due to the parser.
Yes, it is.
What's the input? I tried ## 1
as the input online and it worked.
Small comment @94rain did we cover the case where no markdowns are found at all? Didn't see any in the test cases.
Does test_find_breaks_error_if_no_h1_and_h2_headers
count? Expected notebook to have at least one markdown H2 heading.
(No H1 and H2 headers)
This does not check whether or not there are other markdown texts.
Not really since there's still markdown in the string you're checking, try having a sample with a json or plain-text
Not really since there's still markdown in the string you're checking, try having a sample with a json or plain-text
OK. Just additional tests? Or do we expect the error message to be different to Expected notebook to have at least one markdown H2 heading.
?
Yes exactly!
We should validate the input notebook: