In this practical, you will set up a working environment for reproducible data analysis using the version control system git, the environment management package renv, and RMarkdown/Quarto to create reproducible documents.
After a while, you should see the RStudio IDE with your project.
In the following window, under Git/SVN, select Git as the version control system and save the change.
If you are using RStudio and git is not offered as a version control option, you need to install it on your machine by following the steps below. Posit Cloud users can skip the installation part and go straight to the git configuration steps.
For Windows
Please download Git Bash from Git download.
For macOS
Please install it (recommended) following instructions here: http://git-scm.com/downloads.
For GNU/Linux
On Debian/Ubuntu-based distributions, please run in the terminal:
sudo apt-get install git
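Whichever system you are on, it can help to confirm from the terminal that git is actually reachable before moving on. The following is a minimal sketch; have_tool is a hypothetical helper name introduced here for illustration, not something git provides.

```shell
# Report whether a command is available on PATH.
# have_tool is a hypothetical helper name, not part of git.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

if have_tool git; then
  # Print the installed version to confirm the installation worked.
  git --version
else
  echo "git not found: follow the installation steps for your system"
fi
```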
To configure git, on any machine and for both RStudio and Posit Cloud users, fill in the mandatory user information.
Type in the terminal/Bash (update the name and email):
git config --global user.name "Firstname Lastname"
git config --global user.email "yourEmail@server.com"
Check that the configuration was successful by running:
git config --list
You should be able to see your user.name and user.email set accordingly.
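If the full list is long, the two values can also be read back one at a time. This is only a sketch of one way to do it: check_git_identity is a hypothetical helper name, and it relies on git config --get exiting with a non-zero status when a key is unset.

```shell
# check_git_identity is a hypothetical helper name.
# It reads each value individually and reports anything that is unset.
check_git_identity() {
  status=0
  for key in user.name user.email; do
    # --get prints the value, or exits non-zero if the key is unset.
    if value="$(git config --global --get "$key")"; then
      echo "$key = $value"
    else
      echo "$key is not set"
      status=1
    fi
  done
  return "$status"
}

check_git_identity || echo "Run the two git config commands above, then check again."
```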
Your data analysis will require multiple packages. To use renv, first run install.packages("renv") in the console. To start collecting the list of packages used in the project library, initialize renv by running renv::init(). Inspect the renv.lock file.
The project environment and all used packages can be installed on any other system by running renv::restore() in the console tab.
In your project directory make 2 directories called data and R.
The data directory is where your ChIP-seq data from the previous practicals should be placed.
Download the ChIP-seq data into the data directory and name the file TC1-ST2-D0.12_peaks.narrowPeak. The location and the name of your data file are important for analysis-code.R to work!
The R directory is where you should create a new R script and copy in the code from analysis-code.R in the given repository.
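The same layout can be created and checked from the terminal. The sketch below runs in a scratch directory so that it is safe to try; project_dir and peaks_status are names chosen here for illustration, and you would set project_dir to your own project folder for real use.

```shell
# Use a scratch directory for the demo; point project_dir at your
# own project folder instead when running this for real.
project_dir="$(mktemp -d)"

# Create both directories; -p makes this safe to re-run.
mkdir -p "$project_dir/data" "$project_dir/R"

# The file location and name given in the text.
peaks="$project_dir/data/TC1-ST2-D0.12_peaks.narrowPeak"

if [ -f "$peaks" ]; then
  peaks_status="found"
else
  peaks_status="missing"
fi
echo "peaks file: $peaks_status"
```

In the scratch directory the file is reported as missing, since nothing has been downloaded there; in your own project it should be found once the data is in place.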
Now that you have analysis-code.R, which uses the tidyverse package, you will need to install that package.
Install the tidyverse package by running install.packages("tidyverse") in the console.
To put it on the list of packages used in your project, run renv::snapshot() to update the renv.lock file.
Take a look at the renv.lock file again and notice the difference.
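Because renv.lock is a JSON text file in which each recorded package has a "Package" field, a quick terminal check can show whether tidyverse made it into the file. This is a rough sketch: lock_has_package is a hypothetical helper that does a plain text search rather than parsing the JSON, so treat it as a convenience and not a validation.

```shell
# lock_has_package is a hypothetical helper: it does a plain text search
# for a "Package": "<name>" entry and does not parse the JSON properly.
lock_has_package() {
  lockfile="$1"
  pkg="$2"
  grep -Eq "\"Package\"[[:space:]]*:[[:space:]]*\"$pkg\"" "$lockfile"
}

if [ -f renv.lock ]; then
  if lock_has_package renv.lock tidyverse; then
    echo "tidyverse is recorded in renv.lock"
  else
    echo "tidyverse is not recorded yet: run renv::snapshot() in the R console"
  fi
else
  echo "no renv.lock here: run this from the project directory"
fi
```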
Create a new first-paper.qmd file. Copy the content of first-paper.qmd into this file and save it. The file should appear in the Git tab list.
Tick the file's checkbox in the Stage column. The green icon means it was added.
Click Commit. You are prompted to review your changes. Add a commit message and hit Commit.
The file is no longer listed in the Git tab, naturally! It was committed.
Update the list of authors in the first section - add your name ;) - and save it.
Repeat the previous two steps. Stage the change of the file. The blue icon means it was modified.
Now commit the change. You are prompted to review your changes. Add a commit message and hit commit.
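Staging and committing in the Git tab correspond roughly to git add and git commit in the terminal. The sketch below replays both commits in a throwaway repository; the file content, the commit messages, and the identity values are placeholders chosen for illustration, not the real content of first-paper.qmd.

```shell
# Work in a throwaway repository so the demo cannot touch real work.
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init --quiet

# Identity set for this scratch repository only (placeholder values).
git config user.name "Firstname Lastname"
git config user.email "yourEmail@server.com"

# First commit: a new file is staged (added), then committed.
echo "authors: A. Author" > first-paper.qmd
git add first-paper.qmd
git commit --quiet -m "Add first-paper.qmd"

# Second commit: the file is modified, staged again, then committed.
echo "authors: A. Author, Firstname Lastname" > first-paper.qmd
git add first-paper.qmd
git commit --quiet -m "Add myself to the list of authors"

# Show the history, newest first, and confirm nothing is left to commit.
git log --oneline
git status --short
```

After the second commit, git status --short prints nothing, which matches the file disappearing from the Git tab list.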
At this point, you may want to push your changes to a remote repository (GitHub or GitLab) to share the code with others for further development. This is not covered in this practical.
Open the first-paper.qmd file saved in your project and hit Render.
Review the resulting HTML file. Update the content of the Quarto document.
Switch the output format to Word
Describe the statistics of the length of the peaks in a table
Discuss the distribution of signal values and p values with one sentence that should contain the actual numbers.
Include a citation.
You can add new code chunks, update the text, or add new pieces of code available in the R/analysis-code.R file.
Render the document again.
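As an alternative to the Render button, the document can be rendered to Word from the terminal, assuming the Quarto command-line tool is installed and on your PATH. This is only a sketch: render_docx is a hypothetical wrapper name, and its return codes are a convention chosen here.

```shell
# render_docx is a hypothetical helper name.
# Return codes: 0 rendered, 1 Quarto CLI not found, 2 input file missing.
render_docx() {
  qmd="$1"
  if [ ! -f "$qmd" ]; then
    echo "input file not found: $qmd"
    return 2
  fi
  if ! command -v quarto >/dev/null 2>&1; then
    echo "quarto command not found on PATH"
    return 1
  fi
  # --to docx selects Word output for this run.
  quarto render "$qmd" --to docx
}

render_docx first-paper.qmd || echo "nothing rendered"
```

Run it from the project directory so that first-paper.qmd and the data it reads are found.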
Download the generated .docx file and send it to the trainer.