rbi-learning / Today-I-Learned

1 stars 0 forks source link

10/13 Day#1 Working with Excel #178

Open Limlight86 opened 4 years ago

Limlight86 commented 4 years ago

Day 1 Notes

Morning Exercise Review - Working with the Yelp Api Data Set.

CSV - comma separated Values.

Importing a CSV to excel.

In the Data header tab, use the text import wizard using the “From Text” button.

Because of CSV formant, we want to use the Delimited option for importing the data, separating them by commas. Consecutive delimiter are done on purpose with a CSV, indicating empty fields. Do not consider them as 1 item, but operate empty items. Text qualifiers should be set to double quotes, these indicate categories in data.

When setting up the columns, we need to set the ID and postal code columns to “text”. Some IDs can start with “=“ and excel can interpret that as formula and postal codes can be considered numbers.

Renaming worksheets is as easy as right clicking and choosing the rename option.

In order to sort through values in a column field and make them uniform. Using a filter, you can sort the table by a columns value. Data > Filter. Clicking on the arrow that appears in the column, you can select which unique values you want to work with, ie.e all Burger King. Now we can replace all those values with the same uniform value. This process can also be used to remove any erroneous data, like the yelp api retrieving data from similar named stores.

Pre-work Review

Median data - =MEDIAN(RawData!F2:F63).

The MEDIAN function returns the median (middle number) in a group of supplied numbers. For example, =MEDIAN(1,2,3,4,5) returns 3.

Creating a Named range for data - Highlight the range of data you want, Formulas > Name to create a name for said range. There is also a shortcut to the top left of the excel sheet where you can rename it there. =MEDIAN(NamedData)

Creating Named Ranges

Total Duration calculations

We need convert all the time into a value that we can then convert to hours. We can take a range for the seconds and minutes and give them a specific range. The goal is to count down the time and convert the values from hours to minutes to seconds.

For hours =(ROUNDDOWN(SUM(Minutes, (SUM(Seconds)/60))/60, 0)

For minutes =(ROUNDDOWN(SUM(Minutes, (SUM(Seconds)/60)) -C5,0)

For seconds =(ROUNDDOWN(SUM(Minutes, (SUM(Seconds)/60)) -C5 *60 -D5)*60)

The Excel ROUNDDOWN function returns a number rounded down to a given number of places. Unlike standard rounding, where only numbers less than 5 are rounded down, ROUNDDOWN rounds all numbers down.

Conditional Averages

=AVERAGEIF()

The Excel AVERAGEIF function calculates the average of numbers in a range that meet supplied criteria. AVERAGEIF criteria can include logical operators (>,<,<>,=) and wildcards (*,?) for partial matching.

V Lookup

VLOOKUP is an Excel function to look up data in a table organized vertically. VLOOKUP supports approximate and exact matching, and wildcards (* ?) for partial matches. Lookup values must appear in the first column of the table passed into VLOOKUP.

=VLOOKUP (value, table, col_index, [range_lookup])

=VLOOKUP(A15, DataName, 8)

Working with Restaurants CSV.

Substituting single characters in a column. Substitute calls can be chained to do multiple changes in 1 formula.

The Excel SUBSTITUTE function replaces text in a given string by matching. For example =SUBSTITUTE("952-455-7865","-","") returns "9524557865"; the dash is stripped. SUBSTITUTE is case-sensitive and does not support wildcards.

=SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
L5,
"null", ""),
"{", ""),
"}", ""),
"'", ""),
"Monday:", ""),
"Tuesday:", ""),
"Wednesday:", ""),
"Thursday:", ""),
"Friday:", ""),
"Saturday:", ""),
"Sunday:", "")

Paste options. When copying cells that have been calculated by formulas, it is often beneficial to copy over the values to a new cell so that those explicit values can be used in other operations inside your excel project.

The IF function runs a logical test and returns one value for a TRUE result, and another for a FALSE result. For example, to "pass" scores above 70: =IF(A1>70,"Pass","Fail"). More than one condition can be tested by nesting IF functions. The IF function can be combined with logical functions like AND and OR to extend the logical test.

The OR function is a logical function to test multiple conditions at the same time. OR returns either TRUE or FALSE. For example, to test A1 for either "x" or "y", use =OR(A1="x",A1="y"). The OR function can be used as the logical test inside the IF function to avoid extra nested IFs, and can be combined with the AND function.

The Excel AND function is a logical function used to require more than one condition at the same time. AND returns either TRUE or FALSE. To test if a number in A1 is greater than zero and less than 10, use =AND(A1>0,A1<10). The AND function can be used as the logical test inside the IF function to avoid extra nested IFs, and can be combined with the OR function.

These can be all combined for a series of logic that depend on each other to arrive at a final outcome.

=IF(OR(O26="",P26="",Q26="",R26="",S26="",T26="",U26="",V26="",W26="",X26="",Y26="",Z26="",AA26="",[@[Monday Open ]]=""),FALSE(),AND(O26=0,P26=0,Q26=0,R26=0,S26=0,T26=0,U26=0,V26=0,W26=0,X26=0,Y26=0,Z26=0,AA26=0,[@[Monday Open ]]=0))

The Excel FALSE function returns the value FALSE. FALSE is classified as a compatibility function, and not needed in most cases. It is equivalent to using the Boolean value FALSE directly in a formula.

COUNTIF is an Excel function to count cells in a range that meet a single condition. COUNTIF can be used to count cells that contain dates, numbers, and text. The criteria used in COUNTIF supports logical operators (>,<,<>,=) and wildcards (*,?) for partial matching.

=IF(COUNTIF(Restaurants[@[Monday:Open]:[Sunday:Close]], "", FALSE(), COUNTIF(Restaurants[@[Monday:Open]:[Sunday:Close]], 0)= 4)