Refer to Wiki: List of mathematical symbols
Ɐ: reads as "Turned A", means "For all".
:= or =:: reads as "is defined to be". "x := y", "y =: x" or "x ≡ y" means x is defined to be another name for y, under certain assumptions taken in context.
⊗: tensor product. "V ⊗ W" reads as "the tensor product of V and W".
∃: reads as "There exists". "∃x: P(x)" means "there exists at least one x such that P(x) is true".
∃!: means "there exists exactly one".

Not Understood Yet:
- Particular Solution, for systems of equations
- Abelian Group, from Group Theory, for vector spaces
- Rings, from Group Theory
- Linear algebra and Matrix algebra: same thing.
If there IS a solution (or solutions) to a linear system, then it's consistent. Otherwise, it's inconsistent.
Lecture video timeline | Links |
---|---|
Lecture | 0:00 |
Matrix picture | 2:47 |
Row picture | 3:41 |
Column picture | 8:41 |
Matrix picture in 3D | 15:26 |
Row picture in 3D: intersects of planes | 17:33 |
Column picture in 3D | 23:11 |
Permutation Matrix | 36:42 |
"The fundamental problem of linear algebra is to solve n linear equations in
n
unknowns."
We view this problem in three ways:
- Row picture: each row is an equation, and we can draw each line equation on the graph.
- Column picture: rewrite the equations in the form below; each column is a vector, and we can draw each vector (with its scalar) on the graph.
- Matrix picture: rewrite the equations in Coefficient Matrix form, and see the geometric meaning of a matrix times a vector.
Matrix elimination (or solving systems of linear equations) is the very first and most fundamental skill throughout Linear Algebra. It's probably the first lesson in all sorts of courses.
Before learning to solve systems of linear equations, you really need to get familiar with all the core terminology involved; otherwise it can be very hard to move on to the next stage. And in this case, the best way to learn it is through Wikipedia.
For reference, the core terms are: Gaussian elimination, Gauss-Jordan elimination, Augmented Matrix, Elementary Row Operations, Elementary Matrix, Row Echelon Form (REF), Reduced Row Echelon Form (RREF), and Triangular Form.
Refer to Wiki: Gaussian elimination. It's a row-reduction algorithm for solving systems of linear equations.
Refer to Simple Wiki: Gaussian elimination. Example: showme.com
To perform Gaussian elimination, the coefficients of the terms in the system of linear equations are used to create a type of matrix called an augmented matrix. Then, elementary row operations are used to simplify the matrix.
The goal of Gaussian elimination is to get the matrix into row-echelon form. A matrix in row-echelon form is also said to be in Triangular Form.

Some definitions of Gaussian elimination say that the matrix result has to be in reduced row-echelon form. Gaussian elimination that produces a reduced row-echelon matrix is sometimes called Gauss-Jordan elimination.
To keep it simple, here is the structure:

Gaussian Elimination:
1. Augmented Matrix
2. Elementary row operations
3. Row Echelon Form or Reduced Row Echelon Form

And if we require the result to be in RREF, the algorithm is called:

Gauss-Jordan Elimination:
1. Augmented Matrix
2. Elementary row operations
3. Reduced Row Echelon Form
The three types of elementary row operations used are:

1. Swap two rows.
2. Multiply a row by a non-zero scalar.
3. Add a multiple of one row to another row.

Confusing operation: see where the negative sign was put:
Suppose the goal is to find the solution for the linear system below:

First we need to turn it into Augmented Matrix form:

Then we apply Elementary Row Operations, resulting in Row Echelon Form:

At the end, if we'd like, we can further apply some row operations to get the matrix into Reduced Row Echelon Form:

Reading this matrix tells us that the solutions for this system of equations occur when x = 2, y = 3, and z = -1.
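A quick way to check a result like this is to run Gauss-Jordan elimination programmatically. A minimal sketch with sympy, using a made-up system whose solution happens to be x = 2, y = 3, z = -1:

```python
from sympy import Matrix

# A hypothetical system with solution x=2, y=3, z=-1:
#    x + y +  z =  4
#   2x + y -  z =  8
#    x - y + 2z = -3
augmented = Matrix([
    [1,  1,  1,  4],
    [2,  1, -1,  8],
    [1, -1,  2, -3],
])

# rref() runs Gauss-Jordan elimination: it returns the
# Reduced Row Echelon Form and the indices of the pivot columns.
rref_matrix, pivot_cols = augmented.rref()
print(rref_matrix)   # last column reads off x=2, y=3, z=-1
print(pivot_cols)    # (0, 1, 2): every variable has a pivot
```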
Refer to this lecture video: REF & RREF.
It doesn't really matter whether it is a square matrix or not; there could be a diagonal (main diagonal), or no diagonal to draw at all. The only thing that matters is WHAT IS ABOVE AND WHAT IS BELOW EACH LEADING 1.
It means we put another column into the matrix, representing the right side of the system of equations: the numbers to the right of the = sign.
When we apply elimination to linear equations by hand, we operate on both sides at the same time. But computer programs often apply it to the left side only and record the operations (e.g., multiplying by a number, or adding equations together); when the left side is finished, they apply the same operations to the right side.
If we're told a given matrix is an augmented matrix, we assume the last column is the solution column.
Pivots are found through elementary row operations.

A pivot is also called the cursor, the basic, or the basic variable.
Refer to this video from mathispower4u.
A pivot is the value that represents the unknown variable in each column. There's no pivot in a column if you can't get a 1 in that column. If there's no pivot in a column, the unknown variable of that column can be any number, so we call it a free variable.
Pivot columns: the pivots are found after Row Reduction; going back to the original matrix, the columns WITH pivots are called pivot columns.
It's simple: when you solve out one unknown variable in the linear system, you put its value back into the other equations. We call this process Back Substitution.
And at this point, x₁(1,0,0) + x₂(0,1,0) + x₃(-1,2,0) = (0,0,0). So what coefficients x₁, x₂, x₃ would produce a zero vector? By eyeballing it we can tell that x₁ = 1, x₂ = -2, x₃ = 1 would do.
Refer to this video by mathispower4u
Practice:
A fairly simple way to remember how to do matrix multiplication:

Assume two matrices multiply together as AB = C.

Write out each entry of the product, and pair that entry with the row of A and the column of B whose numbers match the entry's subscripts.

e.g., in the product matrix, C₁₁ is Row-1 of A times Column-1 of B; C₂₁ is Row-2 of A times Column-1 of B.

So on and so forth: write down all the entries of the product, then use the dot product to calculate each one.
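Here's a minimal sketch of that subscript rule in Python with numpy (the matrices are made-up examples):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# C[i][j] is the dot product of Row-i of A and Column-j of B,
# exactly the subscript rule above: C11 = Row-1(A) . Col-1(B).
C = np.zeros((A.shape[0], B.shape[1]))
for i in range(A.shape[0]):
    for j in range(B.shape[1]):
        C[i, j] = A[i, :] @ B[:, j]   # one row dotted with one column

assert np.allclose(C, A @ B)          # matches numpy's built-in product
```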
Refer to Khan academy article.
Elimination is the method EVERY piece of software uses to solve linear equations.

Prerequisites:
Lecture video timeline | Links |
---|---|
Lecture | 0:00 |
Elimination pivots and an example | 3:09 |
Failure of Elimination method | 10:34 |
Augmented matrix | 14:50 |
Operations of matrices elimination | 19:24 |
Row operations of Matrices Multiplication | 20:22 |
Column operations of Matrices multiplication | 21:43 |
Elementary Matrix | 24:46 |
Include all elimination steps in one Matrix | 33:29 |
To do column operations, the matrix multiplies on the right. To do row operations, the matrix multiplies on the left.
Below, a column vector is multiplied by a 3x3 matrix:

The result above is a 3x1 matrix, which is a column vector again. Because:

THE RESULT OF THAT COLUMN OPERATION IS A LINEAR COMBINATION OF THE COLUMNS.

"A MATRIX TIMES A COLUMN, IS A COLUMN."
Below, a row vector multiplies a 3x3 matrix:

The result above is a 1x3 matrix, which is a row vector again. Because:

THE RESULT OF THAT ROW OPERATION IS A COMBINATION OF THE ROWS.
It's also called an Elimination Matrix.
Refer to this very good video by Mathispower4u: Elementary Matrices. Also refer to Mathispower4u: Write a Matrix as a Product of Elementary Matrices.
Simply put, an Elementary Matrix is just an Identity Matrix with ONLY ONE ELEMENT CHANGED.

An Elementary Matrix must be ONLY ONE ROW OPERATION away from the Identity Matrix.

The example above is an elementary matrix that only altered the Row-2, Column-1 entry. We call it the E₂₁ matrix: the elementary matrix that fixes the (2,1) position.
The reason we need an elementary matrix is to apply each single step of the elimination of linear equations. Which means:

FOR EVERY SINGLE STEP OF ELIMINATION, WE NEED AN ELEMENTARY MATRIX.

So for two steps of elimination, we can represent it with elementary matrices as below:

Combining all elimination steps in ONE MATRIX:
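A sketch of that idea in numpy (example numbers are mine): each elementary matrix performs one row operation, and multiplying them together gives the single matrix E with EA = U:

```python
import numpy as np

A = np.array([[1., 2., 1.],
              [3., 8., 1.],
              [0., 4., 1.]])

E21 = np.eye(3); E21[1, 0] = -3.   # Row2 <- Row2 - 3*Row1
E32 = np.eye(3); E32[2, 1] = -2.   # Row3 <- Row3 - 2*Row2

U = E32 @ (E21 @ A)                # upper triangular result
E = E32 @ E21                      # ONE matrix holding both steps
assert np.allclose(E @ A, U)
```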
A Permutation Matrix is ANOTHER TYPE OF ELEMENTARY MATRIX, used only to switch positions of elements in the matrix, without changing any numbers.
Review Dr. Strang's lecture.
Example: to switch two ROWS of a matrix using a permutation matrix:

Example: to switch two COLUMNS of a matrix using a permutation matrix:
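Both cases in a tiny numpy sketch (example numbers are mine):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])

# P is the identity matrix with its rows swapped.
P = np.array([[0, 1],
              [1, 0]])

print(P @ A)   # multiply on the LEFT:  swaps ROWS    -> [[3 4] [1 2]]
print(A @ P)   # multiply on the RIGHT: swaps COLUMNS -> [[2 1] [4 3]]
```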
Prerequisites | Links |
---|---|
Matrix multiplication basics (row * col) | Note |
Elementary matrices | Video |
Refer to Juanklopper's jupyter notebook.
Lecture timeline | Links |
---|---|
Lecture | 0:00 |
Method 1: Multiply matrix by vector | 0:50 |
When allowed to multiply matrices | 4:38 |
Method 2: Multiply matrix by COLUMN | 6:12 |
Method 3: Multiply ROW by matrix | 10:04 |
Method 4: Multiply COLUMN by ROW | 11:37 |
Method 5: Block Multiplication | 18:25 |
Inverse Matrices (Square matrices) | 21:15 |
Invertible Matrix | 22:00 |
Singular Matrix (No-inverse matrix) | 24:39 |
Calculate Inverse of Matrix | 31:52 |
Gauss-Jordan Elimination to solve Inverse of a matrix | 35:20 |
Calculation of an entry of the product matrix:

Each column of the product matrix C is Matrix A times the corresponding column of B.

Each row of the product matrix C is the corresponding row of A times Matrix B.
You can cut each matrix into blocks; the blocks don't need to be equal-sized, as long as they match each other. After you cut the matrices into blocks, the multiplication works just like a smaller matrix multiplication: each block can be treated as a single number in a matrix.
If a matrix's inverse exists, we call the matrix Invertible, or Non-singular.

Only for square matrices can the inverse sit on either the right side or the left side of the original matrix to produce the Identity Matrix.
The simplest way to tell whether a matrix is singular is to calculate its Determinant, which we learned in high school: it's a singular matrix if its determinant is ZERO.

But there's another way to tell:

THIS METHOD IS SO MUCH EASIER FOR GETTING THE INVERSE THAN THE WAY WE LEARNED IN HIGH SCHOOL, WHICH MAKES YOU CALCULATE THE DETERMINANT, ADJUGATE, COFACTORS AND SO ON.
Refer to Khan academy lecture: Inverting a 3x3 matrix using Gaussian elimination.
Practice for Gauss-Jordan Elimination to get Inverse of a matrix:
With the formula above, we get TWO equations, which form a system of equations! That's where Gauss comes in: we AUGMENT TWO COLUMNS onto the matrix.

Why can we use Gauss-Jordan elimination to solve for the inverse of a matrix?

The E above represents all the elementary matrices combined.
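A minimal sketch of the Gauss-Jordan inverse with sympy (the matrix is a made-up example): augment A with the identity, row-reduce, and the right half turns into A⁻¹:

```python
from sympy import Matrix, eye

A = Matrix([[2, 1],
            [1, 1]])

# [A | I] --(Gauss-Jordan / rref)--> [I | A^-1]
augmented = A.row_join(eye(2))
rref_matrix, _ = augmented.rref()
A_inv = rref_matrix[:, 2:]          # right half of the reduced matrix

assert A_inv == A.inv()             # [[1, -1], [-1, 2]]
```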
Refer to mathispower4u for a refresher on: How to get elementary matrices
Prerequisites: Matrix Inverses, Matrix multiplication, Elementary Matrix, Permutation Matrix.
Lecture timeline | Links |
---|---|
Lecture | 0:00 |
What's the Inverse of a Product | 0:25 |
Inverse of a Transposed Matrix | 4:02 |
How's A related to U | 7:51 |
3x3 LU Decomposition (without Row Exchange) | 13:53 |
L is product of inverses | 16:45 |
How expensive is Elimination | 26:05 |
LU Decomposition (with Row exchange) | 40:18 |
Permutations for Row exchanges | 41:15 |
"
A = LU
is the BIG FORMULA for elimination. It's a great way to look at Gaussian Elimination."
Assume A & B
are all invertible matrices, so what is (AB)⁻¹
?
Yes, we multiply their inverses together A⁻¹ & B⁻¹
, but in what order do we multiply these inverses?
IN REVERSE ORDER.
Which makes:
(AB)(B⁻¹A⁻¹) = 𝐈
or (B⁻¹A⁻¹)(AB) = 𝐈
. They perform in the same way get the same result.
so:
(AB)⁻¹ = (B⁻¹A⁻¹)
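A quick numpy check of the reverse-order rule (example matrices are mine):

```python
import numpy as np

A = np.array([[2., 1.], [1., 1.]])
B = np.array([[1., 2.], [0., 1.]])

lhs = np.linalg.inv(A @ B)                 # (AB)^-1
rhs = np.linalg.inv(B) @ np.linalg.inv(A)  # B^-1 A^-1, REVERSE order
assert np.allclose(lhs, rhs)
```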
Similarly, the inverse of a transposed matrix: (Aᵀ)⁻¹ = (A⁻¹)ᵀ
"L is the product of Inverses."
L = E⁻¹
, which means L is the inverse of elementary matrix
.
Assume in the elimination process without row exchanges, we only apply elementary matrices
to the matrix one by one.
So the L
would be the Inverse of those elementary matrices, but in Reverse order.
EA = U
A = LU
So that steps above is the
Inverse Elementary Matrices
picture of getting the L. But actually what actually we get is really simple to observe: If no row exchanges, multipliers go directly into L.
So as we've understood the meaning behind it, we can forget it and just remember the multipliers
.
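A sketch with scipy (note that scipy's `lu` does partial pivoting, so I picked numbers where no row exchange happens):

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[2., 1.],
              [1., 1.]])

P, L, U = lu(A)                   # factorization A = P @ L @ U
print(L)   # [[1. 0.] [0.5 1.]]   -- the multiplier 0.5 went straight into L
print(U)   # [[2. 1.] [0. 0.5]]   -- the matrix after elimination
assert np.allclose(P, np.eye(2))  # no row exchanges for this A
assert np.allclose(A, P @ L @ U)
```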
For LU Decomposition, although we can't represent row exchanges with elementary matrices, we can do it with permutation matrices.
For a 3x3 identity matrix, there are 6 permutations of it:
The inverse of a permutation matrix is its transpose:
For a matrix A, we can factor it as A = LU, just like factoring a number into two numbers.
Online LU Decomposition Calculator
The factor matrix U is the Upper Triangular Matrix, which we're already familiar with: it's the matrix we get after Gaussian elimination.
Refer to video: LU Decomposition using Gaussian Elimination
The factor matrix L is not hard to get either: all the numbers in this matrix are the multipliers we used in each elimination step.
Refer to this video: LU Decomposition - Shortcut Method by Mathispower4u
The final goal of learning LU Decomposition is to solve linear systems.
Refer to this video: Solve a System of Linear Equations Using LU Decomposition
Assume there's an equation AX = B as below, and we want to solve for X:

Steps to apply the LU Decomposition to solve the linear system:

1. Rewrite AX = B as LUX = B
2. Let Y = UX, then solve LY = B for Y
3. Solve Y = UX for X
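A minimal sketch of those steps with scipy (the system is a made-up example): `lu_factor` does the decomposition once, and `lu_solve` runs the two triangular solves:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2., 1.],
              [1., 1.]])
b = np.array([5., 3.])

lu_piv = lu_factor(A)         # decompose A = LU (with pivoting info)
x = lu_solve(lu_piv, b)       # forward solve LY = B, then back solve UX = Y
assert np.allclose(A @ x, b)  # x = [2., 1.]
```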
Lecture timeline | Links |
---|---|
Lecture | 0:00 |
Permutations | 1:17 |
Possibilities of permutations | 7:23 |
Transposes | 10:15 |
General formula for transpose | 11:38 |
Symmetric matrices | 12:43 |
RᵀR is always symmetric | 15:06 |
Chapter 3: Vector spaces | 20:12 |
What "space" means | 22:03 |
Why is Origin necessary in Vector spaces | 25:33 |
Most important thing about Vector space | 28:29 |
A case that's not a Vector space | 29:41 |
All possible subspaces in R² | 35:54 |
All possible subspaces in R³ | 39:04 |
Subspaces come out from Matrices: Column Space | 39:45 |
"Permutation executes Row exchanges."
For LU Decomposition, A = LU DOESN'T work with row exchanges, so we change it to:

PA = LU
# P = Permutation Matrix = Identity Matrix with reordered rows

This applies row exchanges to matrix A to get it into the right order (for pivots), then decomposes it.
Number of possible permutations of an nxn matrix = n!
P⁻¹ = Pᵀ
PᵀP = 𝐈
The way to do a transpose is just to SWITCH ENTRIES: entry (i,j) goes to (j,i), i.e. Aᵀᵢⱼ = Aⱼᵢ.

Remember: intuitively the matrix is NOT rotating into its transpose, but FLIPPING over the diagonal of the matrix. This means the entries ON the diagonal stay the same.
There are some well-known matrices defined by their transpose.

A symmetric matrix is one whose transpose doesn't change it:

Aᵀ = A
# symmetric matrix
Given any matrix R (not necessarily square), the product RᵀR is always symmetric, because after transposing it's still the same:

(RᵀR)ᵀ = Rᵀ(Rᵀ)ᵀ = RᵀR
# Note: (Rᵀ)ᵀ = R, and transposing a product reverses the order.
We can do operations on any vector and stay in the same space: we can add, scale, or combine any R² vectors and we're still in the R² space.

In other words, if some addition or scaling of a vector jumps out of the space, it can't be a vector space. E.g., take the positive quadrant of R² as a space: if we add vectors in it, they stay positive. BUT if we apply a negative scalar to a vector, it leaves the positive quadrant. So it's not a vector space.

EVERY VECTOR SPACE HAS TO HAVE THE ZERO VECTOR IN IT.
Refer to video by TheTrevTutor: Vector Spaces
If a vector space is INSIDE another vector space, e.g. R², we call it a subspace of R².
All Subspaces of R² | All Subspaces of R³ |
---|---|
The whole R² space | The whole R³ space |
- | Any plane through the Origin (0,0,0) |
Any line through the Origin (0,0) | Any line through the Origin (0,0,0) |
The Zero vector itself (𝐙) | The Zero vector itself (𝐙) |
Remember the No.1 rule of a subspace: ALL VECTOR COMBINATIONS FORM A SUBSPACE.

Column space is a special kind of subspace, one that comes out of a matrix.

This means the column space here is built from the matrix's 2 column vectors (plus the zero vector), and the column space (all their linear combinations) forms a 2D plane.
Lecture timeline | Links |
---|---|
Lecture | 0:00 |
What are Vector spaces | 1:05 |
Subspaces of R³ | 2:33 |
Is the union of two subspaces a Subspace? | 4:23 |
Column space | 11:36 |
Features a Column space | 14:46 |
How much smaller is the Column space? | 15:48 |
Does every Ax=b have a solution for every b? | 16:17 |
Which b's allow the system of equations to be solved | 19:39 |
Null space | 28:12 |
Understand what's the point of a Vector space | 40:24 |
For a 3x3 matrix, we pick out its three column vectors and take all their linear combinations; that forms a column space.
Definition: Column Space of A is all Linear combinations of A's columns.
NO! Not always, but sometimes.
"The system of linear equations Ax = b is solvable exactly when b is a vector in the column space of A"
"If you give me a Matrix A, and let me to find N(A). Literally my goal is to find the set of All x's satisfied the equation Ax=0."
Refer to the Khan academy lecture. Refer to this video explanation by TheTrevTutor.
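A small sketch of finding N(A) with sympy (the matrix is my own example):

```python
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [2, 4, 6]])          # second row is a multiple of the first

# nullspace() returns basis vectors for all x with A x = 0.
for v in A.nullspace():
    print(v)                     # two special solutions here
    assert A * v == Matrix([0, 0])
```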
Refer to the note in Pre Linear algebra about understanding the Dot product.
Assume the vector w projects onto the vector v.

Notation:

- Componentᵥw, read as "component of w onto v".
- Projectionᵥw, read as "projection of w onto v".

Notice: when you read it, it's in reverse order! Very important!
Note that the formula relies on these concepts as prerequisites (the names are the same as those mentioned above):

Componentᵥw = (dot product of v & w) / (v's length)
Refer to the lecture by Imperial College London: Projection. Refer also to Khan academy: Intro to Projections.
What if we know the vectors, and we want to know how big the scalar projection (the shadow) is?

Example:

How we solve this: we know the vectors, so we can easily get their dot product; we know the length of each vector by the Pythagorean theorem; and then we get the projection, as in the picture.
It's another idea for projection, and less intuitive. Remember that a scalar projection is the vector's LENGTH projected onto another vector. When we add the DIRECTION to that LENGTH, it becomes a vector lying on the other vector. That makes it a vector projection.
It can be understood with this formula:

Projectionᵥw = (Componentᵥw) * (unit vector of v)

But usually we write it as this:
Refer also to the video on the formula by Kate Penner: Vector Projection Equations. Refer to the video by Firefly Lectures: Vector Projections - Example 1.
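Both formulas in a short numpy sketch (the vectors are made-up examples):

```python
import numpy as np

v = np.array([3., 0.])
w = np.array([2., 2.])

component = (v @ w) / np.linalg.norm(v)   # scalar projection: 2.0
projection = (v @ w) / (v @ v) * v        # vector projection: [2., 0.]

# The projection is the component (a length) times the unit vector of v:
assert np.allclose(projection, component * v / np.linalg.norm(v))
```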
Example:
When changing the basis of a vector, the vector's length and direction remain the same, but the numbers representing the vector change, since the meaning of those numbers has changed. Our goal is to calculate the new numbers representing the vector in terms of the new basis.
Refer to video by Trefor Bazett: Deriving the Change-of-Basis formula
The goal is to write a vector in a new basis.
Refer to lecture form Imperial College London: Changing basis
Remember the Projection formula.
Just to save some words, here's the example and solution:
Example:
Solution:
The idea is to take the projection of the vector onto each new basis vector, except we take only a part of the projection-vector formula. As in the formula below, we take only the blue-squared part as the number for the new vector's component.

Component V₁ = (V·b₁) / |b₁|² = (5*1 + (-1)*1) / (√(1²+1²))² = 4/2 = 2
Component V₂ = (V·b₂) / |b₂|² = (5*1 + (-1)*(-1)) / (√(1²+(-1)²))² = 6/2 = 3
V' = (2, 3)
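The same worked example in numpy, to double-check the arithmetic:

```python
import numpy as np

V  = np.array([5., -1.])
b1 = np.array([1., 1.])
b2 = np.array([1., -1.])    # b1 and b2 are orthogonal

# Component along each new (orthogonal) basis vector: (V . b) / |b|^2
c1 = (V @ b1) / (b1 @ b1)   # 4 / 2 = 2.0
c2 = (V @ b2) / (b2 @ b2)   # 6 / 2 = 3.0

assert np.allclose(c1 * b1 + c2 * b2, V)   # V = 2*b1 + 3*b2, so V' = (2, 3)
```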
Refer to lecture: Matrices changing basis Refer to video: Change of Coordinates Matrix #2
It's a square matrix composed of unit vectors. Usually it's just the Identity Matrix.
Refer to Wiki: Orthogonal Matrix. Refer to lecture by Imperial College London: Orthogonal Matrices
If two vectors are unit vectors AND orthogonal (perpendicular) to each other, they are called Orthonormal. If they form a set of basis vectors, they're called an Orthonormal basis.
If a matrix is composed of orthonormal basis vectors, then its transpose is its inverse:

Aᵀ = A⁻¹
# which makes these true as well
AAᵀ = 𝐈
AᵀA = 𝐈
The determinant of an orthogonal matrix must be either 1 or -1:

|A| = ±1

A determinant of -1 means the matrix was flipped around, from originally right-handed to left-handed.
" My life would probably be easier if I could construct some orthonormal basis somehow. And there's a process for doing that which is called the Gram-Schmidt process" - David Dye, Imperial College London
Refer to Wiki: The Gram–Schmidt process
How do we see this process intuitively? Think about the famous painting The Ambassadors (Holbein) - Wikipedia. The skull in the painting is so hard to recognize because it's in some "funny" basis; we need to transform it into our world basis, the Orthonormal basis, so that we can see it as below:
Refer to videos by Trefor Bazett: Using Gram-Schmidt to orthogonalize a basis; Full example: using Gram-Schmidt; The geometric view on orthogonal projections.
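A minimal sketch of the Gram-Schmidt process in numpy (my own helper, not from the videos): subtract from each vector its projections onto the basis built so far, then normalize:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors."""
    basis = []
    for v in vectors:
        # Remove the projection of v onto every basis vector found so far.
        for q in basis:
            v = v - (q @ v) * q
        basis.append(v / np.linalg.norm(v))
    return np.array(basis)

Q = gram_schmidt([np.array([1., 1., 0.]),
                  np.array([1., 0., 1.])])
assert np.allclose(Q @ Q.T, np.eye(2))   # rows are orthonormal
```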
For a linear transformation, an eigenvector is a vector which, after applying the transformation, stays in the same span.

When we say eigenvectors, we should always say eigenvectors of a linear transformation. It's the same with the determinant of a linear transformation.
A diagonal matrix is a matrix in which the entries outside the main diagonal are all zero.

Imagine applying a transformation matrix many, many times. If we follow the basic matrix-multiplication rule, that's a huge pile of calculations. But the diagonal-matrix rule saves us.
Refer to lecture: Changing to the eigenbasis
If you're lucky enough that the transformation matrix is a diagonal matrix, then when raising the matrix to a power, you just need to raise each diagonal element to that power.

If you aren't lucky, and the transformation matrix isn't a diagonal matrix, then we do Eigen-Analysis, which means:

The steps will be like this (a sketch follows):
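A sketch of those steps in numpy (example matrix is mine): decompose A = C D C⁻¹ with the eigenvectors in C, raise only the diagonal D to the power, then transform back:

```python
import numpy as np

A = np.array([[2., 1.],
              [0., 3.]])

eigenvalues, C = np.linalg.eig(A)       # eigen-analysis: A = C D C^-1
n = 5
A_n = C @ np.diag(eigenvalues ** n) @ np.linalg.inv(C)

assert np.allclose(A_n, np.linalg.matrix_power(A, n))
```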
"Linear algebra is a pillar of machine learning." - Jason
Check THIS LINK for reading book: Jason-Brownlee-Basics-for-Linear-Algebra-for-Machine-Learning-Discover-the-Mathematical-Language-of-Data-in-Python-2018
- Eigendecomposition & Singular Value Decomposition (SVD)
- Principal Component Analysis (PCA)
- Linear Least Squares Regression
Matrices that contain mostly zero values are called sparse, distinct from matrices where most of the values are non-zero, called dense.

Very large matrices require a lot of memory, and most very large matrices that we wish to work with are sparse: almost all of their entries are zeros.
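A tiny scipy sketch of the idea (example matrix is mine): a sparse format stores only the non-zero entries:

```python
import numpy as np
from scipy.sparse import csr_matrix

dense = np.array([[0, 0, 3],
                  [0, 0, 0],
                  [1, 0, 0]])

sparse = csr_matrix(dense)   # stores only the 2 non-zero entries
print(sparse.nnz)            # 2
print(sparse.toarray())      # back to the dense form
```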
Most common types of matrix decomposition:

LU decomposition: the factors L and U are triangular matrices. This is the factorization that comes from elimination.
The Cholesky decomposition is for square symmetric matrices where all values are greater than zero, so-called positive definite matrices:

A = LLᵀ, where L is the lower triangular matrix and Lᵀ is its transpose.

Or: A = UᵀU, where U is the upper triangular matrix and Uᵀ is its transpose.
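A one-line check with numpy (the matrix is a made-up symmetric positive definite example):

```python
import numpy as np

A = np.array([[4., 2.],
              [2., 3.]])          # symmetric positive definite

L = np.linalg.cholesky(A)         # lower triangular factor
assert np.allclose(A, L @ L.T)    # A = L L^T
```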
Eigendecomposition is one of the most widely used kinds of matrix decomposition: it decomposes a square matrix into a set of eigenvectors and eigenvalues.

Not all square matrices can be decomposed into eigenvectors and eigenvalues.

The parent matrix can be shown to be a product of the eigenvectors and eigenvalues: A = QΛQ⁻¹.
Almost all vectors change direction, when they are multiplied by A. Certain exceptional vectors x are in the same direction as Ax. Those are the “eigenvectors”.
The Singular Value Decomposition is a highlight of linear algebra.
The singular value decomposition (SVD) provides another way to factorize a matrix, into singular vectors and singular values. The SVD allows us to discover some of the same kind of information as the eigendecomposition. However, the SVD is more generally applicable.
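A short numpy sketch (example matrix is mine), showing that the SVD also works for a non-square matrix:

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])           # 3x2, not square

U, s, Vt = np.linalg.svd(A)        # singular vectors and singular values

# Rebuild A from the factors: U (3x3), Sigma (3x2), V^T (2x2).
Sigma = np.zeros(A.shape)
Sigma[:2, :2] = np.diag(s)
assert np.allclose(A, U @ Sigma @ Vt)
```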
A more intuitive way to think of vector, matrix, tensor:

Imagine a rectangle:

- Vector space: use a small set of vectors to represent a "full closure" of all the vectors I expect in an operation.
- Matrix: a collection of vectors.
Study resources
Tools
Practice & Quizzes
Study goals of Linear Algebra
MIT OCW Linear Algebra 18.06