Closed. ShiqiYang2022 closed this issue 1 year ago.
Hi Professor @gentzkow! @snairdesai, @arjunsrini, @jc-cisneros, @ShiqiYang2022 and I had a meeting on Friday to discuss what was missing in "shell make" and compiled proposed next steps for your review. We collaborated using this Google Doc.
- Add functionality for other programs (presently only available for Stata and Shell files)
- Convert Makefiles to `make.sh`
- Creating/filling tables
- Copy inputs and externals
- Config: (`which executablename`) to ensure all relevant executables are installed; make config file work for standard YAML format
- User interface
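The executable check proposed under the config step could be sketched roughly as follows. This is only an illustration, not the template's implementation: the function name `require_executables` is hypothetical, and it uses the POSIX builtin `command -v` in place of `which`, because `which` behaves differently across systems.

```shell
# Hypothetical helper (not from the template): verify that every named
# executable can be found on PATH before any script is run.
require_executables() {
    missing=0
    for exe in "$@"; do
        if ! command -v "$exe" >/dev/null 2>&1; then
            echo "Missing executable: $exe" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage; the executable names are assumptions about a given project.
# require_executables python Rscript pdflatex || exit 1
```

Reporting every missing executable before returning, rather than stopping at the first one, lets a new user fix their whole setup in one pass.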
@shrishj Thanks!
Let me jump in to make some comments:
I think our approach here should be to converge on a minimal implementation of the shell template that we are happy with. To that end, I'd suggest we just aim to have a shell that can run Stata, Python, and R and also compile Latex. We can skip the tablefill functionality, creating links, and config steps for now.
Before you do a ton of work implementing this, I think we should all look together at a "rough draft" of it to make sure we're all on the same page about what we want to implement. I'm happy to do that however seems most efficient -- could be a first-cut implementation in actual code, or a mockup of it in pseudocode.
@snairdesai @arjunsrini @jc-cisneros @ShiqiYang2022: I'd appreciate your eyes on this too. Remember that the goal is not to do a 1:1 translation of our current template into shell. The goal is to simplify as much as possible along the way, and to implement the changes we discussed at our meeting.
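As a starting point for that kind of mockup, a minimal dispatcher might look like the sketch below. Everything here is an assumption for illustration: the function names `command_for` and `run_program`, the default log file name, and the executable names (Stata in particular may be installed as `stata`, `stata-se`, or `stata-mp`).

```shell
# Hypothetical helper: map a script's extension to the command that runs it.
command_for() {
    case "$1" in
        *.py)     echo "python" ;;
        *.R|*.r)  echo "Rscript" ;;
        *.do)     echo "stata-mp -b do" ;;   # executable name varies by install
        *.tex)    echo "pdflatex -interaction=nonstopmode" ;;
        *)        return 1 ;;
    esac
}

# Hypothetical helper: run one script and append its output to a log file.
run_program() {
    script="$1"
    logfile="${2:-make.log}"
    cmd=$(command_for "$script") || {
        echo "Unsupported file type: $script" >&2
        return 1
    }
    echo "Running: $cmd $script" >> "$logfile"
    # $cmd is left unquoted on purpose so that "stata-mp -b do" splits
    # into the executable and its arguments.
    $cmd "$script" >> "$logfile" 2>&1
}

# Example usage in make.sh (file names are placeholders):
# run_program code/analysis.py && run_program paper.tex
```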
I’ve updated TunaTemplate to include a "rough draft" working stencil. It runs the same Python scripts as the normal template and compiles a similar tex file. Basic versions of running Stata, Python, R, and LaTeX are implemented.
@snairdesai @jc-cisneros @ShiqiYang2022 @shrishj let me know if this looks right to y'all; perhaps @shrishj can clone the repo and try to replicate it. There are instructions for how to do this in the TunaTemplate readme.
The `run_program` shell functions are defined here. I’ve included the terminal output of my cloning/running of TunaTemplate below.
@arjunsrini Thanks Arjun! I think your proposed `run_program` function looks great!
I played with the `Template` and found that I cannot run it without loading the `gentzkow/template` environment. My error message is as follows.
@ShiqiYang2022 Thanks for testing it!
The error you ran into arises because in your default shell environment (before activating Conda), the `PATH` environment variable does not include a directory that contains a `python` executable. When you run `which python`, it finds no matching executable in any of the directories listed in your `PATH`, resulting in no output.
When you activate a Conda environment, several environment variables, including `PATH`, are temporarily modified for your current shell session. Conda prepends its own directories to the `PATH`. These directories contain executables for Python and other tools installed in the Conda environment.
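The effect of prepending a directory to `PATH` can be reproduced without Conda. The sketch below is a stand-in demonstration: the executable name `tuna_demo_tool` and the temporary directory are made up, and `command -v` is used where the discussion above uses `which`.

```shell
# Create a throwaway directory containing a fake executable.
# (The name "tuna_demo_tool" is invented for this demonstration.)
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from the demo tool"\n' > "$demo_dir/tuna_demo_tool"
chmod +x "$demo_dir/tuna_demo_tool"

# Before the directory is on PATH, the lookup fails and prints nothing.
before=$(command -v tuna_demo_tool || true)

# Prepend the directory, as Conda does with its environment's directories.
old_path=$PATH
PATH="$demo_dir:$PATH"
after=$(command -v tuna_demo_tool)

echo "before: '$before'"
echo "after:  '$after'"

# Restore the original PATH, similar to deactivating the environment.
PATH=$old_path
rm -rf "$demo_dir"
```

The first lookup prints an empty string and the second prints the full path inside the temporary directory, which mirrors `which python` before and after activating the environment.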
My prior implementation of `run_python` assumes that your terminal recognizes the command `python`. I’ve now added functionality to parse a `config.yaml` file for a Python command, which is then used to run Python programs (so you could specify `python3` as your Python command in a `config.yaml` file at the root of your project if you wanted). But I think it is OK for us to assume by default that if someone includes a Python program, the command they’d like to execute it with is `python`.
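One way such a lookup with a default could work is sketched below. This is not the TunaTemplate code: the function name `get_python_cmd` is hypothetical, and the sketch only handles flat `key: value` lines (no nesting, no trailing comments), so it is not a general YAML parser. The key name `pythonCmd` is the one used later in this thread.

```shell
# Hypothetical sketch: read "pythonCmd" from a flat config.yaml, falling back
# to "python" when the file or the key is absent.
get_python_cmd() {
    config_file="${1:-config.yaml}"
    value=""
    if [ -f "$config_file" ]; then
        # Keep the text after "pythonCmd:", take the first match,
        # and strip any single or double quotes around the value.
        value=$(sed -n 's/^pythonCmd:[[:space:]]*//p' "$config_file" \
            | head -n 1 | tr -d "\"'")
    fi
    echo "${value:-python}"
}

# Example usage (the script path is a placeholder):
# "$(get_python_cmd)" code/analysis.py
```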
Let me know if others have comments or run into bugs @snairdesai @jc-cisneros @shrishj
Hi @arjunsrini! Thanks for the clarification. I encountered a similar issue to Shiqi. I have copied in the .yaml file, but do I need to change the pathToRepo and the pathToDb values too so that the program runs correctly? I also seem to be getting an error with the pandas installation. I tried `pip3 install pandas`, but it didn't work. Thank you for the help!
@shrishj thanks for testing it!
Re: config — you can remove the other lines and it should work. For more context, see how the config functionality is implemented here.
Re: pandas error - is this unique to your usage of template? If you are using `python` as your Python command (the default if you don’t use `config.yaml`), I’d try `pip install pandas` (not `pip3`). Or alternatively, set `pythonCmd: "python3"` in your `config.yaml`.
If that doesn’t work, can you load pandas in either of these ways?
If you are using `pip3` (package manager for Python 3) to install `pandas`, it will be installed in your Python 3 environment. You should check that your `python` command is linked to Python 3 (try `python --version`) and is associated with the same environment as your `pip3`. You can confirm this with the commands `which pip3`, `which python3`, `which python`, etc.
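Those checks can be collected into one small helper. The sketch below is illustrative only: `report_cmd` is a hypothetical name, and it uses `command -v` as a portable equivalent of `which`.

```shell
# Hypothetical helper: print where each command resolves on PATH,
# or note that it cannot be found.
report_cmd() {
    for name in "$@"; do
        resolved=$(command -v "$name" 2>/dev/null) || resolved="not found"
        echo "$name -> $resolved"
    done
}

report_cmd pip3 python3 python

# Running pip through a specific interpreter installs the package for
# exactly that interpreter, which avoids a pip/python mismatch:
# python -m pip install pandas
```

If `pip3` and `python` resolve to different directories, packages installed with `pip3` will likely not be importable from `python`.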
If installing pandas is giving you a lot of trouble feel free to put the errors you’re getting in Slack.
Hi @arjunsrini! Thanks so much for your quick reply! It is now working and creating ~/paper_slides/output/paper.pdf. In addition to pandas, I also needed to use pip3 install for matplotlib and linearmodels.
@arjunsrini: Confirming this also works on my end after the relevant installs, great work (it's also noticeably faster than standard `template`, as we expect)! Some notes from our in-person discussion:
`conda`).

I’ve updated TunaTemplate to include automated Python virtual environment (`venv`) functionality using a `requirements.txt` file and shell functions to create and activate the environment. I also have the program quit on error and improved the log file readability for compiling LaTeX.
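A rough sketch of what such functions could look like is given below. It is not the TunaTemplate code: the function names, the `venv` directory name, and the `PYTHON_CMD` variable are assumptions, and the `bin/activate` path applies to Unix-like systems.

```shell
# Stop at the first failing command, mirroring the quit-on-error behaviour.
set -e

# Hypothetical helper: create the environment only if it does not exist yet.
create_venv() {
    venv_dir="${1:-venv}"
    python_cmd="${PYTHON_CMD:-python3}"
    if [ ! -d "$venv_dir" ]; then
        "$python_cmd" -m venv "$venv_dir"
    fi
}

# Hypothetical helper: activate the environment in the current shell.
activate_venv() {
    venv_dir="${1:-venv}"
    # The activate script prepends the environment's bin directory to PATH.
    . "$venv_dir/bin/activate"
}

# Hypothetical helper: install the pinned packages into the active environment.
install_requirements() {
    requirements="${1:-requirements.txt}"
    python -m pip install -r "$requirements"
}

# Typical order in make.sh:
# create_venv && activate_venv && install_requirements
```

Because the environment is created from `requirements.txt`, packages such as pandas no longer need to be installed by hand before the first run.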
cc @snairdesai @jc-cisneros @ShiqiYang2022 @shrishj
@arjunsrini confirming this works on my end! Thanks for the work! As a minor comment, we probably want to add the `venv` directory to the `.gitignore`.
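If the setup script should handle this automatically, an idempotent append is one option. In this sketch the function name `ignore_path` and the entry `venv/` are assumptions; the actual directory name depends on how the environment is created.

```shell
# Hypothetical helper: append an entry to .gitignore only if it is not
# already listed as an exact line.
ignore_path() {
    entry="$1"
    gitignore="${2:-.gitignore}"
    touch "$gitignore"
    if ! grep -qxF "$entry" "$gitignore"; then
        echo "$entry" >> "$gitignore"
    fi
}

# Example usage from the repository root:
# ignore_path "venv/"
```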
@gentzkow you can test it on your end with these steps:
1) Run `git clone git@github.com:arjunsrini/TunaTemplate.git` and `cd TunaTemplate`
2) Load the submodule with `git submodule init` and `git submodule update`
3) Run the "make" shell script at the root of the repo with `bash make.sh`
@shrishj @arjunsrini @snairdesai @ShiqiYang2022 @jc-cisneros Thanks
Quick questions
My broader comment is that I'd like to stop at this point to take stock and think about potential ways to simplify. I'd like to discuss, e.g.,
- `make_lib.sh` layer? Could we replace config yaml altogether with just a config `.sh` script?
- `make_externals.sh` layer? Could we replace this with just defining paths to the externals in the config file and then referring to those paths directly in the downstream scripts? (To do this we just need a way to pass those global variables forward to R/Python/Stata.)
- `run_programs_in_order` as a single command vs. calling individual `run_stata`, `run_python`, etc. commands in `make.sh`, or even calling the shell commands to run these scripts directly in `make.sh`?
- `paper_slides`. The "For now: manually copy" and "For now: manually move" steps are obviously clunky.

I'd suggest we wrap this issue and open a new one to discuss these things.
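To make the config `.sh` idea above concrete, here is a minimal sketch of passing global variables forward through the environment. The variable names and the externals path are placeholders, not anything defined in the template.

```shell
# --- contents a hypothetical config.sh might have ---
# Variable names and paths below are placeholders.
export PYTHON_CMD="python3"
export EXTERNALS_DIR="$HOME/externals"

# --- make.sh would load it with:  . ./config.sh ---

# Exported variables are inherited by every child process, so a downstream
# script can read them directly. A child shell stands in for the script here:
sh -c 'echo "externals live in: $EXTERNALS_DIR"'

# The same value is available through each language's environment lookup:
#   Python: os.environ["EXTERNALS_DIR"]
#   R:      Sys.getenv("EXTERNALS_DIR")
#   Stata:  local externals : environment EXTERNALS_DIR
```

With this approach the config file is plain shell, so no YAML parsing is needed, at the cost of the config being executable code and not pure data.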
In this issue (https://github.com/gentzkow/template/issues/94), we got @shrishj familiar with the gentzkow template, the `gslab_make` library, and @arjunsrini's existing template built on shell functions, along with the corresponding `shmake`.
The deliverable is the compiled list of functionalities in GSLab Functionality.pdf, and the proposed next steps in this comment. In this issue, @arjunsrini also produced a first implementation of the functionalities; for details, see https://github.com/gentzkow/template/issues/94#issuecomment-1809222181 and https://github.com/gentzkow/template/issues/94#issuecomment-1811555092.
This thread continues in #95 with the follow-up discussion of https://github.com/gentzkow/template/issues/94#issuecomment-1819871703.
The goal of this issue is to draft a list of functionalities (e.g., moving files across modules or logging runs) that template/gslab_make and Arjun's shell template solution currently have. To do this, @snairdesai @jc-cisneros and I will help @shrishj get familiar with:
- `gslab_make` library

while @arjunsrini will help (thanks for helping here!) @shrishj with
- "shell make".

The deliverable is a list of functionalities that we have in template/gslab_make but not in Arjun's shell template and vice versa. @shrishj, please reach out to us at any time if you have questions, thanks!