This is called a discrete uniform probability distribution.
Example:-
Let's say that the campus bus is scheduled to arrive at the student union bus stop at 11:30 am. However, the record indicates that the bus arrives between 11:25 and 11:35 in a manner that seems random.
The bus is just as likely to arrive during any given 1-minute interval as during the next 1-minute interval.
So each of the 10 one-minute intervals is equally likely: 1/10, 1/10, ......, 1/10
f(x) = 1/n for n equal outcomes
coin flip = 2 equal outcomes = 1/2 = 0.5
die roll = 6 equal outcomes = 1/6 = 0.167
bus stop = 10 equal outcomes = 1/10 = 0.1
So we can see that the uniform probability of each discrete outcome takes the form:
Probability of each outcome = 1/n
where n is the number of equally likely outcomes.
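The 1/n rule can be checked with a few lines of Python. This is only a minimal sketch; the function name discrete_uniform_probability is made up for illustration and is not part of any library.

```python
from fractions import Fraction

def discrete_uniform_probability(n_outcomes):
    """Probability of each outcome when all n outcomes are equally likely."""
    if n_outcomes <= 0:
        raise ValueError("n_outcomes must be a positive integer")
    return Fraction(1, n_outcomes)

# The three cases from the notes: coin flip, die roll, bus arrival minute.
for label, n in [("coin flip", 2), ("die roll", 6), ("bus stop", 10)]:
    p = discrete_uniform_probability(n)
    print(f"{label}: {p} = {float(p):.3f}")
```

Using Fraction keeps 1/6 exact until it is rounded for display, and the n probabilities always add up to exactly 1.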
Continuous
But think about data such as temperature, distance, income, mass, etc. that can be measured very precisely, to several decimal places. What if the number of possible outcomes is infinite?
Therefore, when dealing with this type of data, we are left with the following odd equation:
probability = 1/infinity
Well, guess what.... 1/infinity = 0
We call this continuous data. Because the probability of any single exact value is 0, we instead measure the probability that a value falls within an interval, as an area under f(x).
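For a continuous uniform distribution between a and b, the area under f(x) is a rectangle with total area 1:
area of rectangle = height*(b-a) = 1
height = 1/(b-a)
f(x) = 1/(b-a)
Expected value(x) = (a+b)/2
Variance(x) = (b-a)*(b-a)/12
Standard deviation(x) = (b-a)/root(12)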
Examples:-
On average, 30-minute TV shows have 22 minutes of actual programming. Let's assume the number of minutes of actual programming is uniformly distributed from a low of 18 minutes to a high of 26 minutes.
f(x) = 1/(b-a) = 1/(26-18) = 1/8 or 0.125
E(x) = (a+b)/2 = (18+26)/2 = 44/2 = 22
V(x) = (8*8)/12 = 64/12 = 16/3 = 5.33
S(x) = 8/root(12) = 8/3.464 = 2.309
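The four quantities above can be recomputed with a short Python sketch. The helper name uniform_summary is hypothetical; it simply applies the formulas for a uniform distribution on the interval from a to b.

```python
import math

def uniform_summary(a, b):
    """Height, mean, variance and standard deviation of a uniform distribution on [a, b]."""
    if b <= a:
        raise ValueError("b must be greater than a")
    width = b - a
    return {
        "height": 1 / width,                # f(x) = 1/(b-a)
        "mean": (a + b) / 2,                # E(x) = (a+b)/2
        "variance": width ** 2 / 12,        # V(x) = (b-a)^2/12
        "std_dev": width / math.sqrt(12),   # S(x) = (b-a)/root(12)
    }

# TV show example: programming time between 18 and 26 minutes.
for name, value in uniform_summary(18, 26).items():
    print(f"{name}: {value:.3f}")
```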
What is the probability the show will have at least 25 minutes of programming?
x1=25, x2=26
P(25 <= x <= 26) = (x2-x1)/(b-a) = (26-25)/8 = 1/8 or 0.125
What is the probability the show will have between 21 and 25 minutes of programming?
x1=21, x2=25
P(21 <= x <= 25) = (25-21)/(26-18) = 4/8 = 1/2 or 0.5
What is the probability the show will have between 22.32 and 24.77 minutes of programming?
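All three interval questions use the same rule: the probability is the width of the interval divided by (b-a). The sketch below is one way to compute it; uniform_interval_probability is a made-up helper name, and clipping the interval to the range from a to b is an added safeguard that the notes do not mention.

```python
def uniform_interval_probability(x1, x2, a, b):
    """P(x1 <= X <= x2) for X uniformly distributed on [a, b]."""
    if b <= a:
        raise ValueError("b must be greater than a")
    # Only the part of the interval that lies inside [a, b] carries probability.
    low = max(x1, a)
    high = min(x2, b)
    return max(high - low, 0) / (b - a)

# The three questions from the TV show example (a=18, b=26).
for x1, x2 in [(25, 26), (21, 25), (22.32, 24.77)]:
    p = uniform_interval_probability(x1, x2, 18, 26)
    print(f"P({x1} <= x <= {x2}) = {p:.5f}")
```

For the last question the interval is 24.77 - 22.32 = 2.45 minutes wide, so the probability is 2.45/8, or about 0.306.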
Why do we have two different distributions that are used in the exact same type of formulas, yet give different answers?
Example: Randomly select 5 students (the sample) from a college freshman class of 10,000 students (the population of interest).
Find the GPA of those 5 students.
Declare that "The average GPA of the entire freshman class is....."
The limits of data in research
When doing quantitative research or analysis, we are often interested in a large population:
The average GPA of a university freshman class.
The number of customers served at a McDonald's between noon and 1 pm.
However, due to time and cost, we almost always use sample data to represent the large population.
But sample data is at best an approximation of the larger population from which it is selected.
As the sample size 'n' becomes smaller, we are less certain that the sample represents the entire population, so there is a greater risk of error.
Often we know nothing about the population: not its mean, variance, or standard deviation.
Sample Size
Sample size refers to the number of participants or observations included in a study.
This number is usually represented by n.
The size of a sample influences two statistical properties:
the precision of our estimates.
the power of the study to draw conclusions.
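One common way to put a number on the precision of an estimated mean is the standard error, sd/root(n). The notes do not name it, so treat this as an illustrative sketch; the standard deviation of 0.5 GPA points is an assumed value, not data.

```python
import math

def standard_error(sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return sd / math.sqrt(n)

assumed_sd = 0.5  # hypothetical standard deviation of GPA, for illustration only

# A larger sample gives a smaller standard error, i.e. a more precise estimate.
for n in (5, 50, 500):
    print(f"n={n}: standard error = {standard_error(assumed_sd, n):.4f}")
```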
Degrees of freedom
The easiest way to understand degrees of freedom conceptually is through an example:
Consider a data sample consisting of, for the sake of simplicity, five positive integers. The values could be any number with no known relationship between them. This data sample would, theoretically, have five degrees of freedom.
Four of the numbers in the sample are {3, 8, 5, and 4} and the average of the entire data sample is revealed to be 6.
This must mean that the fifth number has to be 10. It can be nothing else. It does not have the freedom to vary.
So the degrees of freedom for this data sample is 4.
The formula for degrees of freedom is the size of the data sample minus one:
Df=N-1
Where:
Df=Degrees of freedom
N=Sample Size
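The five-number example can be reproduced in Python. Both helper names below are made up for illustration: once the mean and four of the values are fixed, the fifth value is forced.

```python
def missing_value(known_values, sample_mean, sample_size):
    """The one remaining value, forced by the known values and the sample mean."""
    if len(known_values) != sample_size - 1:
        raise ValueError("exactly one value must be unknown")
    # The total of all values must equal mean * size.
    return sample_mean * sample_size - sum(known_values)

def degrees_of_freedom(sample_size):
    """Df = N - 1"""
    return sample_size - 1

known = [3, 8, 5, 4]
print(missing_value(known, sample_mean=6, sample_size=5))  # 10
print(degrees_of_freedom(5))                               # 4
```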
t-distribution and z-distribution:
Can the population standard deviation (sigma) be assumed known?
Yes = sigma known = Use the z-distribution.
No = sigma unknown = Use the sample standard deviation to estimate sigma = Use the t-distribution.
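The decision rule can be sketched as a small Python function. Everything here is an assumption made for illustration: the name standardize_sample_mean, the five GPA values, the hypothesized mean of 3.0, and the sigma of 0.5. The point is that the formula has the same shape in both cases; only the standard deviation used, and therefore the reference distribution, changes.

```python
import math
import statistics

def standardize_sample_mean(sample, hypothesized_mean, population_sd=None):
    """Standardize a sample mean and report which distribution to compare it against."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    sample_mean = statistics.mean(sample)
    if population_sd is not None:
        # Sigma known: use it directly and refer to the z-distribution.
        std_error = population_sd / math.sqrt(n)
        return {"distribution": "z",
                "statistic": (sample_mean - hypothesized_mean) / std_error,
                "df": None}
    # Sigma unknown: estimate it with the sample standard deviation,
    # then refer to the t-distribution with n - 1 degrees of freedom.
    std_error = statistics.stdev(sample) / math.sqrt(n)
    return {"distribution": "t",
            "statistic": (sample_mean - hypothesized_mean) / std_error,
            "df": n - 1}

gpas = [3.1, 2.8, 3.6, 3.3, 2.9]  # made-up GPAs for 5 sampled students
print(standardize_sample_mean(gpas, 3.0))                     # sigma unknown
print(standardize_sample_mean(gpas, 3.0, population_sd=0.5))  # sigma assumed known
```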
1) Uniform probability distribution
2) A tour of the normal distribution
3) z values and t values
4) Stock risks and the normal distribution
5) Is my data normally distributed?