Try to break the template

gentzkow commented 7 months ago

Goals:

Identify as many cases as possible where a user would encounter errors / bugs cloning and running the current version of the template
Eliminate errors where possible
Add friendly error handling where that's not possible
Keep track of the things we tried and use them to produce a list of "unit tests" that we can use for subsequent testing of the template

An initial list of things to try; you should add to this as many other things as you can think of.

Running on Windows
Running on Linux
Running on a clean machine that does not have the correct R/Python packages installed
Running on a machine that does not have R/Python/Stata installed but trying to run a make.sh script that calls an R/Python/Stata sciprt
Running a make.sh script without cd-ing into the correct directory (e.g., bash /mydir/template/1_data/make.sh)
Running the template without running setup.sh first

@snairdesai @jc-cisneros @ShiqiYang2022 I'm assigning this to @shrishj, but I'd like you all to help out as well. In particular, you should figure out what's the easiest way to get him access to a Windows machine. For Linux testing, we could try using Github codespaces? Or we could do it on a Mac creating clean environments w/ Conda?

jc-cisneros commented 7 months ago

@gentzkow, some comments on testing on Linux/Windows:

Linux: My sense is that Linux does not have any particularities regarding hardware (e.g., the M1/M2 chip on a Mac) and is the predominant language on virtual machines. TTesting it on Codespaces or a custom Docker container would suffice. A GitHub codespace is indeed a virtual machine running Linux in a Docker container hosted by GitHub (I think the default is Linux Ubuntu, but this can be changed). Sherlock's OS is Linux CentOS, which can also be used for testing.
Windows: Although we could build a Docker container that runs Windows OS, we want to test it on an actual Windows machine. In the current version of the repository, template is compatible with Windows through its built-in Linux virtual machine (Windows Subsystem for Linux). Unfortunately, Stanford's libraries and computer clusters only have machines that run Mac OS (see here). My suggestion would be to reach out to The Tech Desk and check if they have a Windows machine @shrishj's could check out for a couple of weeks. We cannot use their Web Checkout portal directly because they only offer MacBooks and Chromebooks (the latter runs on Linux-based ChromeOS). If that is not possible, both @ShiqiYang2022 and I have personal laptops that run on Windows.

ShiqiYang2022 commented 7 months ago

Thanks @gentzkow and thanks @jc-cisneros for very thoughtful comments! I have few points to throw in:

For Windows, we might consider utilizing remote desktop solutions or cloud services to access a Windows environment. E.g. use Microsoft Azure or Amazon EC2 for testing purposes.

Compared to Docker, it uses a complete Windows operating system, running on physical hardware or cloud infrastructure; while Docker are a lightweight solution, running atop the host's system and sharing its kernel.
Compared to using a real windows machine, it might be more cost-effective, especially when there is a need to test across systems with multiple different configurations(our Template).

I am more than happy to test the Template on my personal Windows machine. I think SIEPR might also have some idle windows machines because I heard one predoc has asked Lisa for a new windows machine to replace his old one. I will reach out to Lisa for more details next week.

gentzkow commented 7 months ago

Thanks! I think a Widows VM and/or a physical machine at SIEPR could be fine. A big issue will be whether Git Bash or some other emulator is enough to run the shell scripts on Windows without having to use a Linux virtual machine.

jc-cisneros commented 7 months ago

@shrishj @gentzkow letting you guys know that we got a Windows PC for two months. It is available at the lab at SIEPR.

shrishj commented 7 months ago

Thanks JC! I will spend some time doing testing from my laptop and once exhausted for Mac, I will come to use the Windows PC.

shrishj commented 7 months ago

Hi @gentzkow! Here are a few errors that a user may encounter when using template. I have added in ideas for how we can improve error handling. I have been having trouble with my conda environment when I was testing today, so I am unable to do try the wrong version of Python and even check what would happen if its missing entirely. I will visit the team this week to get some help. I will then look into using the Windows PC.

When running ‘bash make.sh’ before running ‘bash setup.sh’ we see the following error message:

Making template with shell: /bin/bash
make.sh: line 10: ../local_env.sh: No such file or directory
Improvement: Perhaps, when we are faced with ‘local_env.sh’ as a missing file, we output to the user that we need them to run ‘setup.sh’ for full clarity.

When we try to run the program from the wrong directory we see the following error message:

bash: run_all.sh: No such file or directory

When we remove one of the modules (3_slides) we see the following error message:

run_all.sh: line 13: cd: 3_slides: No such file or directory
Improvement: Perhaps, we could formally flag to the user that they are missing a major module.

When we remove a ‘make.sh’ file from one module we see the following error message:

Making template with shell: /bin/bash
/bin/bash: make.sh: No such file or directory
Improvement: It can be difficult to know which folder is missing the ‘make.sh’ file as it is buried in a module. Perhaps, we can signal to the user that they need to check the specific module for the missing file.

gentzkow commented 7 months ago

@shrishj Thanks!

The first of these is something I think we should definitely fix as you suggest. I think anywhere where we call local_env.sh we should have this kind of error handling. The error messages should say that local_env.sh is missing and then recommend that the user run setup.sh to fix.

The others look to me like cases that will not happen as often, but still worth addressing if we can.

A general question for us to answer is how to add error handling without cluttering up the make.sh scripts. Ideally, all of this should live within the scripts in lib/shell. Maybe we create a new tool called check_setup.sh that runs a bunch of checks including looking for local_env.sh? We could call this at the top of each make.sh. We can also add extra error handling within the run_XXX commands.

shrishj commented 7 months ago

Hi @gentzkow. I have made a commit to the development branch with a new check_setup.sh file that houses the error handling. We then source check_setup.sh in the run_all.sh file to ensure that everything is in order prior to running the template.

General Updates:

I finally managed to get my Conda environment working after help from the team, so I can test older Python packages! I tested template using Python 2.7 and it seemed to handle it without throwing any errors.
I also had a few errors on my side as pandas, matplotlib and statsmodels were not already installed. I thought it would be useful to include this in our check_setup.sh file. If these modules are not installed, check_setup.sh uses pip install to fix the issue.
Currently, check_setup.sh ensures that bash setup.sh is run and that all appropriate modules are imported. The only issue is that template no longer works for Python 2.7 post my updates. I will investigate further to see if I can figure it out. However, from initial glance, it looks like the modules we are installing are not compatible with such old Python version. Then again, it seems unlikely for a user to be using such an old version.

cc @jc-cisneros @ShiqiYang2022 @snairdesai

gentzkow commented 7 months ago

@shrishj Thanks.

We want to anticipate that most users will be running individual modules using the make.sh scripts. The run_all.sh script is actually run pretty rarely in practice, because for large projects it requires a lot of time. So we need to execute check_setup.sh in each make.sh script as well.
It's a bigger to do to think about how we want to handle package dependencies. For real projects, the set of dependencies will evolve a lot over time and we need a robust way to make sure that they're tracked correctly. We will want users to have the option of doing installation in a way that doesn't affect their local environment (e.g., via Conda). I think we should bracket all this for now and remove the module checking + pip install commands from check_setup.sh.

shrishj commented 7 months ago

Hi @gentzkow. Thanks for the feedback! I have just committed to our development branch making the changes (sourcing check_setup.sh in the make.sh files and removing the pip install commands). I will speak to the team regarding how we can go about handling the package dependencies and comment a proposal soon. I will also continue testing to see if I notice anything else. Thanks!

gentzkow commented 7 months ago

Thanks @shrishj. I think we should defer package dependencies to a subsequent issue. Let's keep this one scoped to only what's in the issue description here.

Let's do the following:

check_setup.sh should live in lib/shell/
check_setup.sh should work no matter where we call it from, so we can't assume that local_env.sh is at ../. We should pass ${REPO_ROOT} as an argument to check_setup.sh and then look for the file at ${REPO_ROOT}/local_env.sh.
Similarly, make.sh files should work no matter where they are in the file structure provided ${REPO_ROOT} is set correctly. So the call to check_setup.sh should come after line 10 of make.sh and should call ${REPO_ROOT}/lib/shell/check_setup.sh not ../check_setup.sh.
Let's make the error message

The file local_env.sh was not found at the root of the repository. This is a local settings file that must exist in order for make.sh to run. Please run the script setup.sh at the root of the repository to create local_env.sh and set up the local environment.

One other note: I realize that we've left out the files .gitignore and .gitattributes from the template so far. These should live at the root of the repo. The .gitignore file defines files that don't get committed to git, which should include local_env.sh as well as .DS_store. Please add the two files in the zip archive below to the root of the repo and delete the local_env.sh and .DS_store files that have been committed by mistake (check all subdirectories).

Archive.zip

Next steps on this issue include:

Test on Windows
Add error handling to the run_xxx.sh commands for the case where the program in question (Stata, R, Python, etc.) is not installed or the executable is not on the path.

Let me know when you think you can complete this.

@snairdesai @jc-cisneros @ShiqiYang2022 Can you guys also do some testing here to see if you can uncover additional errors a user might run into and/or suggest more things for @shrishj to try?

shrishj commented 7 months ago

Hi! Thanks for the feedback! I have made changes 1-4 locally.

When downloading Archive.zip, it was showing me an empty folder (not sure if there is an issue on my side). I wanted to check if the Archive.zip was meant to contain the .gitignore and .gitattributes. For now, I have created my own .gitignore and .gitattributes. In .gitignore, I only ignore local_env.sh and .DS_store. In gitattributes I copied over what was in template. I have deleted the necessary files too. Before committing, I wanted to check that everything was in order. Thank you!

gentzkow commented 7 months ago

Huh. I think .gitignore and .gitattributes are hidden files, so you need to enable viewing them? (E.g., command-shift-P on a Mac)

Here is the content I had.

.gitattributes

*.log merge=binary

.gitignore

# `local_env.sh` is ignored as it contains user-specific settings
*local_env.sh

# `/temp/` subdirectories are ignored as they are temporary by design
*temp/

# `/output_local/` subdirectories are ignored as they contain large outputs that will be stored outside of Github
*output_local/

# The following extensions indicated intermediate files generated by certain programs during compiling
*.pyc
*.lyx~
*.lyx#
*.lyx.emergency
*.Rhistory
*.Rapp.history
*Rplots.pdf
*.DS_Store
*.aux
*.fls
*.lof
*.lot
*.nav
*.snm
*.toc
*.out
*.svn
*.ipynb_checkpoints/
*setup_stata.log

# Personal conda info
.conda_info

shrishj commented 7 months ago

Thanks @gentzkow - it seemed to be hidden from my view. I have made the changes detailed in the previous message and will now look at adding error handling to the run_xxx.sh commands when the program in question (Stata, R, Python, etc.) is not installed or the executable is not on the path.

gentzkow commented 7 months ago

Thanks!

ShiqiYang2022 commented 7 months ago

@shrishj

I had some time to test the template on my new initialized branch and attached are my feedback:

I am highlighting package issue aside our slack conversation in case you need -- our template currently do not have a environment, per comment and comment, I think we need to create environment separately. An error throw out if we do not use old template env in e00dd35. Per comment, we will defer this to an separate issue, @shrishj please feel free to open the follow-up when you have bandwidth.
check_setup.sh cannot print my REPO_ROOT, because this is not pre-defined. I addressed this in my branch. See here.

``` (template) SIEPR-C02G50GUML86:GentzkowLabTemplate shiqiyang$ cd ./lib/shell/ (template) SIEPR-C02G50GUML86:shell shiqiyang$ bash check_setup.sh The file local_env.sh was not found at the root of the repository. This is a local settings file that must exist in order for make.sh to run. Please run the script setup.sh at the root of the repository to create local_env.sh and set up the local environment. ```

@gentzkow Do we want users to run make.sh from any places it was called? If so, I suggest to define REPO_ROOT in the absolute path format in make.sh. @shrishj You can see what I mean here.
Some codes to output message, e.g. https://github.com/gentzkow/GentzkowLabTemplate/blob/86e43c070649807545528448a6940679a2271dcd/1_data/make.sh#L30 need to be adjusted to echo -e ..., to enable echo command to interpret certain escape sequences within the string. I fixed in my own branch here.
I think the input for /3_slides/ should live under /3_slides/input/ instead of 3_slides/source/input; currently there are two places to store the /input/. Also we do not need /3_slides/input/mpg.csv/.
The problem same as 5 in /4_paper/ on repo structure.
In run_R.sh https://github.com/gentzkow/GentzkowLabTemplate/blob/06b42fb8e3d4423a78613043afca120fcaee462f/lib/shell/run_R.sh#L17, we need to use >> instead of >, as > means overwrite, while >> means append. I fixed in 0108edc.

I addressed a subset of those bullets on my own branch issue3-shiqi-testing, @shrishj if it looks good to you, feel free to merge my branch to the issue branch 3-try-to-break-the-template in order to save your time. I will test the template on my Windows system when I have further bandwidth.

gentzkow commented 7 months ago

Thanks @ShiqiYang2022.

On (3), I'm assuming make.sh will always be called from the directory it lives in. I.e., bash make.sh not bash 2_analysis/make.sh. I'm not sure how getting the absolute path as here would fix this -- won't it still break if I call it from a different directory because $REPO_ROOT_RELATIVE will then be incorrect?

Other comments sound good to me.

ShiqiYang2022 commented 7 months ago

@gentzkow Thanks!

I'm not sure how getting the absolute path as here would fix this -- won't it still break if I call it from a different directory because $REPO_ROOT_RELATIVE will then be incorrect?

You are right, sorry for the mistake. What I have previously in mind is we first got the absolute path where the script is, and then use that path to define the repo path. I fixed in 2bd74ad, the example for check_setup.sh see here.

https://github.com/gentzkow/GentzkowLabTemplate/blob/2bd74ad408206389a602311d012bc3fa2a786b00/lib/shell/check_setup.sh#L5-L8

Nevertheless, if we anticipate user always call make.sh from the directory it lives in, then we do not need to define the absolute path. But anyway thanks for the great catch!

gentzkow commented 7 months ago

Ooh, that's a nice solution! If we can make things robust to the user calling make.sh from anywhere that will be a big improvement. In that case we might be able to get rid of the cd commands in run_all.sh?

Let's test this solution carefully to make sure it works robustly.

gentzkow commented 7 months ago

Another thing to think about would be how we can trap errors where the relative path in

REPO_ROOT=$(realpath "$SCRIPT_ABSOLUTE_PATH/../../../")

is wrong and so we're not pointing to the actual root of the repository. Maybe have a step in check_setup.sh where we confirm that run_all.sh exists in ${REPO_ROOT} and issue a warning if it does not? If we did this we'd want to give the user instructions or turning off the warning and/or modifying the file it checks for, since there may be cases where someone intentionally deletes or renames run_all.sh.

ShiqiYang2022 commented 7 months ago

That's great! I agree with getting rid of cd commands. Will make the team conscious on testing the robustness on this solution. I do not think we have steps to confirm run_all.sh exist, we need to implement. I am also aligned with adding error shooting and checking messages to navigate users to fix.

shrishj commented 6 months ago

Apologies for the delay. Thanks so much for your changes @ShiqiYang2022 and @gentzkow. I have merged Shiqi's development branch into our original branch. I have not deleted Shiqi's branch - I am happy to do it once Shiqi gives me the go-ahead.

I am adding a check for confirming if run_all.sh exist in the ${REPO_ROOT}. In terms of the error messaging, should I tell users how to toggle off this check by going into check_setup.sh and changing some arbitrary boolean value?

gentzkow commented 6 months ago

Thanks @shrishj. Lets add the warning when run_all.sh doesn't exist but not worry about giving the user the option to change it for now. Also, let's make sure this is a warning message not an error. We can just say:

The file run_all.sh does not exist in the directory currently specified as the repository root. This may mean that the variable REPO_ROOT in make.sh is set incorrectly. The variable REPO_ROOT should be a relative path pointing to the top level of the repository.

ShiqiYang2022 commented 6 months ago

@shrishj

I have merged Shiqi's development branch into our original branch. I have not deleted Shiqi's branch - I am happy to do it once Shiqi gives me the go-ahead.

Awesome! You could go ahead and delete my test branch.

shrishj commented 6 months ago

Hi @gentzkow. I have pushed a commit to the development branch with a warning message for run_all.sh and we still allow the program to run. I also made a change to the $REPO_ROOT variable in check_setup.sh as the variable was pointing to the parent directory instead of the required subdirectory. It is passing all of my tests so far. Thanks for the support!

gentzkow commented 6 months ago

@ShiqiYang2022 @shrishj I have reviewed these changes. I spent some time making additional updates -- revising the structure of the make.sh scripts and the run_XXX scripts to correct some errors and make them simpler / more robust.

I think we are good on the issues discussed so far.

Please do another quick round of testing to make sure my changes haven't introduced any errors. The remaining item is to test on Windows.

@shrishj I would like to accelerate the pace here. Do you think you'll be able to work the ~8 hrs per week we had agreed to once spring quarter starts?

@ShiqiYang2022 Do you think you might be able to jump in and do the Windows testing?

ShiqiYang2022 commented 6 months ago

@gentzkow Thanks! Happy to speed up with full gear on this.

I am more than happy to do the windows test -- I am wrapping up what I have on my plate in SearchMarket project and conduct the testing today.

gentzkow commented 6 months ago

Great. Thanks! You should keep prioritizing SearchMarket ahead of this, but if you have a chance to do the Windows test that would be great.

shrishj commented 6 months ago

Hi @gentzkow, very sorry for the delay. I have just finished my exams and will be working on both issues in the coming days. I will post some updates regarding testing and the work on this issue.

ShiqiYang2022 commented 6 months ago

I am providing feedback on my Windows machine. Unfortunately, there are some errors, so this is a temporary post as I have not yet successfully run the template from my Windows machine. I will keep this post modified and updated.

Again, our current template does not have a corresponding environment. We need to create one; a (hot + easy) fix is to use what we have in our old template. But I vote to design one for this template. This need to be done prior to merging back to master.
Line break issue. The error messages indicate that the all .sh script, when cloning and checking out, the line ending has Windows-style (\r\n), whereas Linux and UNIX systems expect UNIX-style line endings (\n). This discrepancy leads to errors such as "$'\r': command not found". Unlike Shell scripts, R and Python interpreters (which are used in our old template) handle line endings can correctly identify line ends with either LF or CRLF, thus avoiding execution issues caused by line ending differences.
- @shrishj Have you observed the same issue when testing Windows on your end? If not, then probably there's some issues do with my personal machine.
- Git provides its official guidance in handling line ending issues. I tried to change global settings and .gitattributes as suggested, however, those did not work on my end. I think if necessary, I need to refresh the branch as suggested to normalize the line endings.
- I also think maybe it's worth doing line break checking and fixes in setup.sh, I implemented a fix and auto-check in fd61fc9.
The issue of correct passing the entire file path. This is related to what we discussed in #85 comment and #85 comment. In the shell, we need to quote the whole path to ensure it is passed correctly. I implemented the fix in f101660.
The latex compile issue. I think this is because some configuration issues with my machine for latex. I am fixing this on my own end, so leaving a placeholder here. Will update this after I identified this issue clearly.

gentzkow commented 6 months ago

Thanks @ShiqiYang2022.

Can you remind me what this means?

Again, our current template does not have a corresponding environment. We need to create one; a (hot + easy) fix is to use what we have in our old template.

Can you also give a little more detail on what the line ending issue is? Is it only an issue for the shell scripts? Or does it mean all files committed to Git from Windows are going to have the wrong line endings for Mac and vice versa?

Let's not make any changes to the template yet unless they are things we'd want to do even if we weren't trying to have it run on Windows. I want to get a sense of the issues before deciding whether we want to make it cross-platform compatible.

ShiqiYang2022 commented 6 months ago

Thanks @gentzkow! Re your https://github.com/gentzkow/GentzkowLabTemplate/issues/3#issuecomment-2010267938:

Can you remind me what this means?

Again, our current template does not have a corresponding environment. We need to create one; a (hot + easy) fix is to use what we have in our old template.

I think the way we run the template on Windows is, we need to use a conda environment, as we decided in https://github.com/gentzkow/template/issues/95#issue-2003434454 and https://github.com/gentzkow/GentzkowLabTemplate/issues/3#issuecomment-1948934677, to handle package dependencies.

For the line ending issue, sorry, I should be more articulated. It happened when I cloned the repository and shifted to the issue branch on my windows machine. Then I tried to run run_all.sh from command line but it broke into error as follows:

``` (template) shiqiy@ShiqiYang:/mnt/c/Users/Shiqi Yang/Desktop/GSLab/GentzkowLabTemplate$ bash run_all.sh : invalid option 2: set: - set: usage: set [-abefhkmnptuvxBCHP] [-o option-name] [--] [arg ...] run_all.sh: line 3: $'\r': command not found run_all.sh: line 6: $'\r': command not found with shell: /bin/bash run_all.sh: line 9: $'\r': command not found : No such file or directory : No such file or directorysh : No such file or directory ``` : No such file or directory

This is due to the line endings from my machine has Windows-style (\r\n), whereas Linux and UNIX systems (here is WSL(Windows Subsystem for Linux)) expect UNIX-style line endings (\n).

Is it only an issue for the shell scripts?

Currently, I would say yes, because after I fix the line ending issue for all shell scripts, I can run /1_data/ and /2_analysis/ successfully.

Or does it mean all files committed to Git from Windows are going to have the wrong line endings for Mac and vice versa?

When committing or checking out, git will not change the line endings of the scripts. It will remain as what it is when commited unless specified core.autocrlf = true.

Usually, the files we committed from Windows would not have such error when executed, because we call our scripts in python or R in most of our projects, and those interpreters can interpret the line endings no matter what type of line endings it is, and will not have issues in executing.

But, if we call our shell scripts using bash interpreter, bash are sensitive to the line endings. So when a shell script is called with CRLF (\r\n) line endings in Windows, Unix/Linux Shell interpreters (WSL) treat the CR (\r) as part of the command, leading to execution errors.

I hope it clarifies!

gentzkow commented 6 months ago

OK. Thanks.

On the Conda issue, I would have hoped we could be agnostic about how people are installing Python and R in this template, just as we are on Macs.

On the line ending issue, what are you using as the bash emulator on Windows? My thought had been we might start by trying Git Bash.

ShiqiYang2022 commented 6 months ago

Understood, thanks!

I am using Windows Subsystem for Linux (WSL) on my end, I started using this as our lab used WSL for APD replication. I will also try using Git Bash!

gentzkow commented 6 months ago

Thanks @ShiqiYang2022. Interested to see whether it's any different in Git Bash.

I guess for an individual shell file another option would be to run it with something like

sed 's/\r$//' run_all.sh | bash

?

If that worked I guess Windows users could create a custom command line tool to do that (e.g., "wbash") and then we could set that to be the ${SHELL} executable.

Another option would be to have a git hook that converts all *.sh files when they are checked out. But in that case would a Windows user no longer be able to read / edit them in a normal text editor?

gentzkow commented 6 months ago

@ShiqiYang2022

I was reading more about the Git line ending normalizations here. Would another option be to add

*.sh -text

to .gitattributes so that the shell files retain their lf line endings?

Also: We should probably open a separate issue for the Windows testing and wrap up the current issue.

ShiqiYang2022 commented 6 months ago

Thanks @gentzkow!

sed 's/\r$//' run_all.sh | bash

Totally, this will work, yesterday in https://github.com/gentzkow/GentzkowLabTemplate/issues/3#issuecomment-2008626046 I implement a similar fix using dos2unix, see https://github.com/gentzkow/GentzkowLabTemplate/commit/fd61fc98ea2c0abdd91939296ca2959d2e9e9226.

Using custom command line tool sounds like a applicable plan, will test on my end. Not sure whether this is necessary to use git hook, but I guess we could add similar conversion into our setup.sh file? (see here)

I was reading more about the Git line ending normalizations here. Would another option be to add

*.sh -text

to .gitattributes so that the shell files retain their lf line endings?

Totally, I agree. Yesterday I implemented similar fixes in d121d0b to automatically fit the line endings across operating systems. Somehow it does not work on my end unfortunately. I am wondering whether this is because I implemented on the branch instead of main. But I will investigate further, and also test the way to edit .gitattributes as you suggested!

Opening a separate issue for the Windows testing sounds good to me! Will wrap up this with a PR soon.

ShiqiYang2022 commented 6 months ago

Threads continues in its Pull Request #7.

ShiqiYang2022 commented 5 months ago

Summary

In this issue we conducted a test run on Mac to identify potential problems in our current GentzkowLabTemplate.

Merged into master in 1f45308. Final state of the issue branch here.

gentzkow / GentzkowLabTemplate

Try to break the template #3