I have uploaded two scripts to my git repo containing my solutions for modifying `tableofSNPs.csv` to perform data cleaning (task) and permute nucleotides (extra task). I have also included within this issue my code to check that these scripts have completed their tasks successfully. For review by @cecileane and @coraallencoleman.
task: data cleaning
Write a one-liner using `sed` to remove `"` and `,` from the `Minimum` column.
script
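As an illustration of the task (not the contents of the script itself), here is a minimal sketch of one possible `sed` one-liner. The sample rows and column layout are hypothetical, and the sketch assumes the `Minimum` value is the only quoted field in a row and uses commas as thousands separators.

```shell
# Hypothetical sample rows; the real tableofSNPs.csv layout may differ.
sample='chr1,"1,234,567",A,T
chr2,"89,012",T,G
chr3,345,C,A'

# Loop: drop one comma inside the quoted field per pass until none remain,
# then strip the quotes. Assumes at most one quoted field per row.
cleaned=$(printf '%s\n' "$sample" |
  sed -E -e ':a' -e 's/"([^",]*),([^"]*)"/"\1\2"/' -e 'ta' -e 's/"//g')

printf '%s\n' "$cleaned"
```

The separate `-e` expressions keep the label and branch commands portable between GNU and BSD `sed`.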
The script is located at `scripts/fix_minimums.sh`. The script must be run from the main directory. To run, type:
checking the edited version
To ensure that the script has run correctly, check the edited version with:
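A check of this kind could be sketched as below. This is an assumption-laden example and not necessarily the exact command from this issue: the rows are hypothetical, and the pattern assumes a cleaned row has four comma-separated fields.

```shell
# Hypothetical cleaned rows plus one row that still has a stray comma.
rows='chr1,1234567,A,T
chr2,89012,T,G
chr3,"3,456",C,A
chr4,78,G,C'

# Rows with exactly three commas become "match"; anything else passes through.
# uniq then collapses each run of consecutive "match" lines into one.
check=$(printf '%s\n' "$rows" |
  sed -E 's/^[^,]*(,[^,]*){3}$/match/' |
  uniq)

printf '%s\n' "$check"
```

If every row has exactly three commas, the output reduces to a single `match` line.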
If a given row has exactly three commas in it, it will be replaced by the word `match`. If not, the unedited row will be displayed. The `uniq` command compresses all consecutive matches into a single line.
extra task: nucleotide permutation
Write a one-liner using `sed` to permute `A` to `T` and `T` to `A`.
script
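As an illustration of the task (not the contents of the script itself), the swap can be sketched with the `y` (transliterate) command of `sed`. The sample rows are hypothetical and have no header line.

```shell
# Hypothetical sample rows (no header); the real file may differ.
snps='chr1,1234567,A,T
chr2,89012,T,G
chr3,345,C,A'

# y/AT/TA/ maps every A to T and every T to A in a single pass.
permuted=$(printf '%s\n' "$snps" | sed 'y/AT/TA/')

printf '%s\n' "$permuted"
```

Chaining `s/A/T/g` and `s/T/A/g` would not work here, because the second substitution would also convert the `T`'s just produced by the first. Note that `y` affects every `A` and `T` on a line, including any in a header row or other columns.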
The script is located at `scripts/permute_nucleotides.sh`. The script must be run from the main directory. To run, type:
checking the edited version
To ensure that the script has run correctly, there are a few different ways to check the edited version.
Check that the new number of `A`'s matches the original number of `T`'s.
Check that the new number of `T`'s matches the original number of `A`'s.
Visualize the file headers to ensure that `A`'s and `T`'s have been swapped.
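The three checks above could be sketched as follows. The file names, sample rows, and the `count` helper are hypothetical, and the permuted file is produced here with `sed 'y/AT/TA/'` purely for demonstration.

```shell
# Hypothetical original and permuted data held in temporary files.
tmpdir=$(mktemp -d)
printf '%s\n' 'chr1,1234567,A,T' 'chr2,89012,T,G' 'chr3,345,T,A' > "$tmpdir/original.csv"
sed 'y/AT/TA/' "$tmpdir/original.csv" > "$tmpdir/permuted.csv"

# grep -o prints each occurrence on its own line, so wc -l counts characters.
count() { grep -o "$1" "$2" | wc -l | tr -d ' '; }

orig_A=$(count A "$tmpdir/original.csv")
orig_T=$(count T "$tmpdir/original.csv")
new_A=$(count A "$tmpdir/permuted.csv")
new_T=$(count T "$tmpdir/permuted.csv")
echo "original: A=$orig_A T=$orig_T"
echo "permuted: A=$new_A T=$new_T"

# First rows of both files side by side, original on the left.
paste -d '|' "$tmpdir/original.csv" "$tmpdir/permuted.csv" | head -n 3

rm -rf "$tmpdir"
```

The permutation succeeded if the permuted `A` count equals the original `T` count and the permuted `T` count equals the original `A` count.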