cnolmsted / git_notes


sed homework due 2018_10_03 #2

Open cnolmsted opened 6 years ago

cnolmsted commented 6 years ago

Here is a list of one-liner shell commands

run the following command when in the same folder as tableofSNPs.csv

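sed -E 's/,"([0-9]+),([0-9]+),([0-9]+)",/,\1\2\3,/' tableofSNPs.csv | sed -E 's/,"([0-9]+),([0-9]+)",/,\1\2,/' > tableofSNPs_fixed.csv

This command creates a new file, tableofSNPs_fixed.csv, free of pesky quotes and of thousands-separator commas inside numbers. It works by first changing all the "XX,XX,XX" formatted numbers, and then changing all the "XX,XX" formatted numbers. Two limits to keep in mind: there is no g flag, so each sed call only fixes the first such number on a line, and the quoted number has to sit between two other fields (it needs a comma on both sides).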
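To see what the two substitutions do, here is a small sketch run on made-up rows. The sample data is hypothetical, the real tableofSNPs.csv may be laid out differently.

```shell
# Hypothetical sample rows (not taken from the real tableofSNPs.csv).
sample='rs1,"1,234,567",A,T
rs2,"45,678",T,A
rs3,910,A,A'

# First pass: quoted numbers with three comma-separated groups.
# Second pass: quoted numbers with two comma-separated groups.
fixed=$(printf '%s\n' "$sample" \
  | sed -E 's/,"([0-9]+),([0-9]+),([0-9]+)",/,\1\2\3,/' \
  | sed -E 's/,"([0-9]+),([0-9]+)",/,\1\2,/')

printf '%s\n' "$fixed"
# prints:
# rs1,1234567,A,T
# rs2,45678,T,A
# rs3,910,A,A
```

Rows without a quoted number, like the third one, pass through unchanged.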

run the following command to check if the above command worked properly

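sed -E 's/^[^,]*,[^,]*,[^,]*,[^,]*$/1/' tableofSNPs_fixed.csv | uniq | less

This command checks that each line of the .csv file has 4 fields delimited by exactly 3 commas. Every line that does is replaced by the number 1, and uniq collapses the repeats, so if the whole file is fine a single 1 appears in the less viewer. Any line with more or fewer than 3 commas shows up unchanged. (An unanchored pattern such as .*,.*,.*,.* would also match lines with more than 3 commas, so it could not catch leftover commas inside numbers.)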
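Counting fields per line is another way to run this kind of check, and it reports which lines are off. A minimal sketch with awk, assuming 4 fields per line is the expected layout; the sample file below is hypothetical and stands in for tableofSNPs_fixed.csv.

```shell
# Hypothetical sample file; the third line is deliberately one field short.
tmp=$(mktemp)
printf '%s\n' 'rs1,1234567,A,T' 'rs2,45678,T,A' 'rs3,910,A' > "$tmp"

# Report the line number and field count of every line without exactly 4 fields.
bad=$(awk -F',' 'NF != 4 { print NR ": " NF " fields" }' "$tmp")

printf '%s\n' "$bad"
# prints:
# 3: 3 fields

rm -f "$tmp"
```

On a clean file this prints nothing, which is easier to read than scanning the output in less.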

run the following command to swap T's with A's and A's with T's

sed 's/T/F/g' tableofSNPs_fixed.csv | sed 's/A/T/g' | sed 's/F/A/g' > tableofSNPs_fixed_TAswapped.csv

This command creates a new file, tableofSNPs_fixed_TAswapped.csv, in which the T's and A's are swapped, using F as a temporary placeholder. It only works if all the nucleotides are capital letters and there are no F's anywhere in the .csv file. It also swaps every capital T and A in the file, including any in a header line or in other fields. Of course, make sure you ran the previous commands before this one.
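The placeholder letter can be avoided with sed's y command, which maps characters one-to-one in a single pass. This is an alternative sketch, not the command used above, and the sample rows are hypothetical.

```shell
# Hypothetical sample rows standing in for tableofSNPs_fixed.csv.
sample='rs1,1234567,A,T
rs2,45678,T,A'

# Three-step swap through the placeholder F.
three_step=$(printf '%s\n' "$sample" | sed 's/T/F/g' | sed 's/A/T/g' | sed 's/F/A/g')

# Single-pass swap: y/AT/TA/ turns every A into T and every T into A.
one_pass=$(printf '%s\n' "$sample" | sed 'y/AT/TA/')

printf '%s\n' "$one_pass"
# prints:
# rs1,1234567,T,A
# rs2,45678,A,T
```

Both versions give the same result on this sample, but the y version does not care whether the file already contains F's. Like the original, it still changes every capital A and T in the file, not only those in the nucleotide columns.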

checking if the above command worked:

I would simply open a second terminal, and use less to look at tableofSNPs_fixed.csv and tableofSNPs_fixed_TAswapped.csv files simultaneously, and make sure a few A's and T's were successfully swapped. If they were, then the rest should be too.
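A less manual check is also possible: swapping A and T a second time should give back the original file exactly, and the swapped file should differ from the original. The sketch below runs that round trip on a hypothetical sample in a temporary folder; with the real data the two file names would be tableofSNPs_fixed.csv and tableofSNPs_fixed_TAswapped.csv.

```shell
# Hypothetical sample files in a temporary folder.
tmpdir=$(mktemp -d)
printf '%s\n' 'rs1,1234567,A,T' 'rs2,45678,T,A' > "$tmpdir/fixed.csv"
sed 's/T/F/g' "$tmpdir/fixed.csv" | sed 's/A/T/g' | sed 's/F/A/g' > "$tmpdir/swapped.csv"

# 1) Swapping back must reproduce the original byte for byte.
# 2) The swapped file must not be identical to the original.
if sed 'y/AT/TA/' "$tmpdir/swapped.csv" | cmp -s - "$tmpdir/fixed.csv" \
   && ! cmp -s "$tmpdir/swapped.csv" "$tmpdir/fixed.csv"; then
  result="round trip OK"
else
  result="round trip FAILED"
fi

printf '%s\n' "$result"
# prints:
# round trip OK

rm -rf "$tmpdir"
```

This covers every line of the file rather than the few inspected by eye, though looking at a handful of rows in less is still a useful sanity check.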

Tags:

@cecileane @coraallencoleman