Closed rcalvo12 closed 1 year ago
Given that all the tests failed/were cancelled, it looks like editing test_build_latex.py will be necessary.
@rcalvo12 Yes, thanks! Will have a look later today. (PS: I can probably help with the failing tests too, given they're failing on Ubuntu and I'm on Linux; ofc feel free to debug in the meantime.)
@rcalvo12 Did you run the tests locally? This doesn't seem to be specific to Linux; the tests as they are ought to fail generally. Just want to check this is the case.
Tagging @jmshapir since I think some of these overlap with how this builder ought to be designed. Some wrinkles I've found so far:
.bib file, with a .bib file, that BibTeX is run on the right target (i.e. build and not source), and that .bbl/.blg files are created if there is a .bib file as input.
biber).
subprocess.check_output will fail. I believe the idea behind this is to make sure the builder is invoked only once, so that the tests being run correspond to a single call to the builder. However, in this case the builder makes 4 calls. It should be possible to make multi-line calls but this may not be the best design.
I'll make code-specific comments (if there are any) in my review. LMK if I should help with any of the above before then.
Thanks @rcalvo12 for your persistent work here!
I think all my in-line comments are addressed (conditional on @mcaceresb's confirmation where I tagged him). I think it might be good to focus on the few broader points in https://github.com/JMSLab/Template/pull/56#issuecomment-1086312640 and https://github.com/JMSLab/Template/pull/56#pullrequestreview-929480958.
I also think it might be a good idea to take @jmshapir's and @mcaceresb's opinion about the following point in https://github.com/JMSLab/Template/pull/56#pullrequestreview-929480958:
I wonder how we'd like to deal with cases where we have additional files that have references to the main text, such as an online Appendix. In these cases the order in which we compile the files would matter.
I am curious if we should have a builder that accounts for these cases. In theory, we can have cross-references across many files, and I am not sure about the architecture of a builder that can allow for an arbitrary number of auxiliary files such as online appendices. If we think that we should aim to account for the "most usual structure", then things might get simpler. If the most common structures are:
then conditional logic to cover these would probably be simple. If we agree, as a lab practice, to keep appendices in the same file as the main text, then we won't even need conditional logic.
Thanks @veli-m-andirin! Regarding the issue of multiple TeX files that you highlight in https://github.com/JMSLab/Template/pull/56#issuecomment-1099613997, my suggestion is to leave this aside for the purpose of this pull.
If no one has a current use case, we can perhaps add it to the roadmap as a feature request.
If, now or later, either we in the lab or one of the developers has a use case, we can open an issue to tackle it. (I suspect there may be a way to handle arbitrary numbers of files but we might need to add a metadata file that tells the builder the order in which to compile. And we could always start with the simplest case as you say.)
Does that sound good?
Thanks @jmshapir! This sounds good to me. To be precise, the use case I had in mind was based on EventStudy where we had an appendix that cross-referenced the main text where the order of compilation mattered. That being said, I agree that having a builder that can compile a single file with a bibliography correctly would be a good first step.
@jmshapir I would be fine either way we do this. I can take a look at a simple appendix cross-reference case and see if I can come up with a solution given the EventStudy use case @veli-m-andirin pointed out (Maybe budget no more than a couple of hours to look at it). I am also fine leaving this for another issue down the road.
Thanks @veli-m-andirin @rcalvo12!
My instinct is to leave the multi-TeX-file case for another issue.
@jmshapir Should tests be tackled in this PR or a later issue? If the latter, two notes:
LMK. My code comments are addressed but not the test comments I made here. However, if the tests are for a follow-up I'll approve as well.
@rcalvo12 @veli-m-andirin fyi
@jmshapir Should tests be tackled in this PR or a later issue?
@mcaceresb thanks. My instinct is that, before we merge this pull, it's probably good to have at least some minimal working tests for the functionality we've implemented here, including TeX+BibTeX. Let me know if that answers or if we should make a more fine-grained decision.
@rcalvo12 fyi.
@jmshapir Sounds good.
@rcalvo12 If you know how to implement the above great; otherwise I am happy to go into more detail about how the tests here are structured.
@mcaceresb I spent some time looking over it, and I think some more detail about how the tests are structured would be extremely useful. Thanks!
@rcalvo12
The tests are based on the unittest module, which you can see from the fact that the test classes are based on unittest.TestCase.
In general, to add a test we define a function named test_... with the structure:
@[patch]
def test_[test](self, mock_[mock]):
    # test code
This much I reckon is clear from reading the code, but I think figuring out what is happening beyond this becomes difficult. Let me explain this by dissecting the parts of the first LaTeX test:
@subprocess_patch
def test_default(self, mock_check_output):
    '''
    Test that build_latex() behaves correctly when provided with
    standard inputs.
    '''
    mock_check_output.side_effect = fx.latex_side_effect
    target = 'build/latex.pdf'
    helpers.standard_test(self, build_latex, 'tex',
                          system_mock = mock_check_output,
                          source = ['test_script.tex'],
                          target = target)
    self.assertTrue(os.path.isfile(target))
Why do all tests have a patch decorator and mock output? The first key thing to know is that we can only test the Python code with the automated unit tests. In other words, we cannot in general run the entire repository: proprietary software such as Matlab and Stata is not available on demand on GitHub servers. While we could run LaTeX, LyX, and R tests, since they are open source, for now we are only testing the Python code itself.
The solution is to "pretend" that we are running them without actually invoking the command-line call. The @ decorator patches the system call so that it does not actually get invoked by the system; instead it gets invoked in a pretend testing environment. The mock function generates pretend output (in lieu of actual output) so we can check whether the Python code is doing what it's supposed to.
@subprocess_patch in this case patches the calls to check_output made by the builder and replaces them with mock calls.
mock_check_output produces mock output based on a side_effect; these are program-specific and defined in _side_effects.py.
The rest of the code is meant to check that a particular aspect of the builder works as expected. In this case, standard_test is run, which is defined in _test_helpers.py. This calls the builder, redirects the log file, and runs the side-effect.
Finally, we check what we want. In this case, we check the target file was created (this happens in the side-effect, which parses the function call using a pre-defined regex; see command_match() for details).
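To make the patch / mock / side-effect pattern above concrete, here is a minimal, self-contained sketch. It does not use the repository's helpers: build_pdf, fake_pdflatex, and the command string are hypothetical stand-ins chosen for illustration, and the regex is a simplification rather than the one in command_match().

```python
import os
import re
import subprocess
import tempfile
import unittest
from unittest import mock


def build_pdf(source, target):
    # Hypothetical stand-in for a builder: it shells out exactly once.
    jobname = os.path.splitext(target)[0]
    command = 'pdflatex -jobname=%s %s' % (jobname, source)
    subprocess.check_output(command, shell=True)


def fake_pdflatex(*args, **kwargs):
    # Side effect: parse the command with a regex and create the file
    # that the real program would have produced.
    command = args[0]
    match = re.search(r'-jobname=(?P<jobname>\S+)', command)
    with open(match.group('jobname') + '.pdf', 'w') as f:
        f.write('mock pdf')


class TestBuildPdf(unittest.TestCase):
    @mock.patch('subprocess.check_output')
    def test_default(self, mock_check_output):
        # The patched check_output never reaches the system; it only
        # runs the side effect defined above.
        mock_check_output.side_effect = fake_pdflatex
        with tempfile.TemporaryDirectory() as tmp:
            target = os.path.join(tmp, 'latex.pdf')
            build_pdf('test_script.tex', target)
            mock_check_output.assert_called_once()
            self.assertTrue(os.path.isfile(target))
```

The test passes without pdflatex installed, because the only thing that "runs" is the side effect.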
These should be resolved before coding any new tests:
At the moment the standard tests call system_mock.assert_called_once() to check that the builder was only called once (since this is standard for all programs except LaTeX). We need to decide whether to make a LaTeX-specific exception or collapse the three LaTeX calls to a single check_output call.
system_mock assert it was called exactly thrice for .tex files. Unsure how to do this but I assume it is possible. LMK if you do not find it in the docs.
check_output call for LaTeX with os-specific switches, the main drawback is that we'd have to modify the LaTeX regex, which I don't think is a good idea. This would be the case even if we found an OS-independent way to do it I think (unless check_output has a "repeat command" switch),
BibTeX cannot be checked with the current LaTeX tests for two reasons:
check_output would be called four times instead of three with BibTeX.
This is my own take:
Figure out how to make the LaTeX tests pass. This should require only adding the switch specific to LaTeX to assert three calls to check_output.
Decide how to code the BibTeX tests. There are two choices:
My vote is for the latter. In either case we need to code a BibTeX side-effect. For this step, it will be sufficient to code an empty BibTeX side-effect and add logic in the LaTeX side-effect to switch to BibTeX when this is called instead of LaTeX. LMK if you have trouble doing this (it's fine for the side-effect to be empty since none of the LaTeX tests should call it).
Once that is done, if both are correct, we can:
.blg and .bbl file based on the BibTeX command.
command_match.
check_output is asserted to have been called 4 times (not 3).
I think this will be plenty for this issue. You can add a test_bibtex_basic function to the LaTeX tests and add a .bib file to the dependencies, which should activate the BibTeX. This tests:
.bib is correct (the rest of the LaTeX tests would do this).
.bib is correct (this new BibTeX test would do this).
The other two tests to add would be:
.bib files.
But I think 1-3 above is already plenty for this PR.
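The call-count point above (three calls for plain LaTeX, four once BibTeX is involved) can be sketched with unittest.mock's call_count attribute. Everything here is an assumption for illustration: run_builder, assert_call_count, and the expected_calls parameter are hypothetical names, and the order of the pdflatex/bibtex calls is a guess rather than what build_latex.py does.

```python
from unittest import mock


def run_builder(check_output, source, has_bib=False):
    # Hypothetical builder: pdflatex, optionally bibtex, then pdflatex
    # twice more. The real call sequence may differ.
    check_output('pdflatex %s' % source)
    if has_bib:
        check_output('bibtex %s' % source.replace('.tex', ''))
    check_output('pdflatex %s' % source)
    check_output('pdflatex %s' % source)


def assert_call_count(system_mock, expected_calls=1):
    # Generalizes assert_called_once(): the default of one call covers
    # every builder that invokes its program a single time.
    assert system_mock.call_count == expected_calls, (
        'expected %d calls, got %d' % (expected_calls, system_mock.call_count))


# Plain LaTeX: three calls to the (mocked) system command.
latex_mock = mock.Mock()
run_builder(latex_mock, 'test_script.tex')
assert_call_count(latex_mock, expected_calls=3)

# LaTeX with a .bib dependency: one extra call for bibtex.
bibtex_mock = mock.Mock()
run_builder(bibtex_mock, 'test_script.tex', has_bib=True)
assert_call_count(bibtex_mock, expected_calls=4)
```

Keeping the default at one call means the existing non-LaTeX tests would not need to change; only the LaTeX tests pass a different count.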
@mcaceresb Thank you for the incredibly useful summary above. I tried my hand at implementing the first step so that it asserts 3 calls if latex is specified. Any feedback would be appreciated.
@rcalvo12 Nice! The only issue that remained was that LaTeX cleanup is invoked on failure and expects add_out_name to have been run, so I switched it to L32 (above execute_system_call). You can see all the tests are now passing. This should always be the case now (you can run tests locally using pytest).
Happy to continue to help as you make progress on coding the BibTeX test.
@mcaceresb Thanks again for your help here! I'm trying to work on this step now:
In either case we need to code a BibTeX side-effect. For this step, it will be sufficient to code an empty BibTeX side-effect and add logic in the LaTeX side-effect to switch to BibTeX when this is called instead of LaTeX. LMK if you have trouble doing this (it's fine for the side-effect to be empty since none of the LaTeX tests should call it).
I was able to add an empty side-effect but I am unclear on how to add the logic within the LaTeX side effect for switching. Maybe you can expand on this a bit more?
@rcalvo12 You can write a bibtex regex in command_match(), then do something like
bibmatch = helpers.command_match(command, 'bibtex')
if bibmatch:
    bibtex_side_effect(*args, **kwargs)
    return
match = helpers.command_match(command, 'pdflatex')
# rest of LaTeX side-effect
Happy to also give guidance on writing the regex if you need it! LMK.
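For reference, the switching logic above can be sketched as a runnable toy. The command_match function here is a hypothetical, much-simplified stand-in for helpers.command_match (its regex is an assumption), and the side effects only record which branch ran so that the dispatch is visible.

```python
import re
from unittest import mock

calls = []  # records which side effect handled each command


def command_match(command, executable):
    # Hypothetical simplification: match "<executable> <source>".
    pattern = r'\s*%s\s+(?P<source>\S+)' % re.escape(executable)
    return re.match(pattern, command)


def bibtex_side_effect(*args, **kwargs):
    # Could be left empty; here it records the call for illustration.
    calls.append('bibtex')


def latex_side_effect(*args, **kwargs):
    command = args[0]
    # Hand off to the BibTeX side effect when the command is a bibtex call.
    if command_match(command, 'bibtex'):
        bibtex_side_effect(*args, **kwargs)
        return
    match = command_match(command, 'pdflatex')
    if match:
        calls.append('pdflatex')


mock_check_output = mock.Mock(side_effect=latex_side_effect)
for command in ['pdflatex paper.tex', 'bibtex paper', 'pdflatex paper.tex']:
    mock_check_output(command)
```

A single mock can therefore stand in for both programs, since every call to the patched check_output goes through latex_side_effect first.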
@mcaceresb I tried my hand at what you suggested here, would appreciate feedback. Thanks!
@rcalvo12 One excellent resource is regex101. You can input your regex and sample text to see if it matches. If you select python as the flavor, you can see
(?P<executable>\w+)\s+(?P<target>-\w+\s+\S+)?\s*(?P<log_redirect>\>\s*[\.\/\\\w]+\.\w+)?
does not quite partition
bibtex target_file > sconscript.log
correctly. While the expression matches, the groups are not divided in the way we want. You can see the issue is with target, which was based on a regex for an option rather than a source or target. (Note other options for sources/targets assume a file extension.) You can play around with the website, or see what worked for me:
PS: I do realize I'm giving you feedback in a very piecemeal way. I thought it would be helpful, but LMK if you'd like me to help more actively in finishing the code: I don't want to prompt you to spend too much time on it if it's not helpful for you.
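The partition issue with the regex quoted above can also be checked locally with Python's re module. In the sketch below, ORIGINAL is the pattern exactly as quoted; ALTERNATIVE is a hypothetical pattern written only for this illustration (it is not the regex referred to in the thread, and not necessarily what ended up in command_match()). It simply drops the leading "-" requirement so that target can capture a bare file name.

```python
import re

# The pattern as quoted in the thread: target expects an option like "-x value".
ORIGINAL = (r'(?P<executable>\w+)\s+(?P<target>-\w+\s+\S+)?\s*'
            r'(?P<log_redirect>\>\s*[\.\/\\\w]+\.\w+)?')

# Hypothetical alternative (an assumption): target is a path-like token
# with no required extension.
ALTERNATIVE = (r'(?P<executable>\w+)\s+(?P<target>[\.\/\\\w]+)\s*'
               r'(?P<log_redirect>\>\s*[\.\/\\\w]+\.\w+)?')

command = 'bibtex target_file > sconscript.log'

original_groups = re.search(ORIGINAL, command).groupdict()
alternative_groups = re.search(ALTERNATIVE, command).groupdict()

# ORIGINAL matches, but only the executable group is filled in.
print(original_groups)
# ALTERNATIVE splits the command into executable, target, and log redirect.
print(alternative_groups)
```

With ORIGINAL, both optional groups come back as None because the text after the executable does not start with "-" or ">", which is the "matches but does not partition" behavior described above.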
@mcaceresb After spending some time on this issue over the past handful of weeks, I am not much closer on the implementation of the bibtex test and I think it would probably be more efficient to bring you in on the implementation. Maybe we can set up a call to discuss, so I can learn from you more directly on this.
@rcalvo12 Messaged you; we'll set something up and hopefully close this PR.
@jmshapir fyi
@rcalvo12 Following our meeting, I pushed the changes we discussed implementing:
nsyscalls argument to standard_test to account for LaTeX and bibtex running the builder multiple times.
check_bib function works when .bib files are specified.
The automated python tests are all passing and I've approved the PR.
@jmshapir fyi
Thanks @mcaceresb for that very productive meeting! In https://github.com/JMSLab/Template/pull/56/commits/58010a8070f96f59c224c82cb75b2f57f39fc443, I reran the code and tests on my end and I also added latex to the list of envs in Sconstruct. Everything appears to be working as intended.
@jmshapir fyi.
@veli-m-andirin Did you have any other changes here? If not can you approve?
@rcalvo12, feel free to ignore the comment about the pdf for talk, it seems to be done on purpose!
Thanks @veli-m-andirin! See https://github.com/JMSLab/Template/issues/66#issue-1378677265 for context.
@jmshapir and @veli-m-andirin could you review this PR when you get a chance?
In this issue, we made changes to improve the LaTeX builder so that it could produce papers from LaTeX files that use BibTeX for references.
Changes:
build_latex.py
Template.tex and References.bib to source/paper
gdp_educ.tex and top_gdp.tex to be referenced by Template
Template.lyx to use References.bib for references
source/paper/sconscript
For this PR, I think it would work if:
test_build_latex.py does not need edits
@mcaceresb, if you want to review as well, I can add you as a reviewer.