DS4PS / cpp-525-sum-2020

Course shell for CPP 525 Foundations of Program Evaluation III for Summer 2020.
http://ds4ps.org/cpp-525-sum-2020/

Lab 01 #2

Open JaesaR opened 4 years ago

JaesaR commented 4 years ago

@Dselby86 @lecy

Hi, I'm having trouble starting the lab - I've loaded the data according to the instructions, but when I go to make the table for Q1a I am running into an error.

My code is:

# Q1a: create Time, TimeSince, and Treatment variables

passengers$Y <- round(passengers$Y ,1)

d.temp <- rbind( head(passengers), c("...","...","...","..."),
                 passengers[ 198:200, ], 
                 c("**START**","**TREATMENT**","---","---"), 
                 passengers[ 201:204, ],  
                 c("...","...","...","..."),
                 tail(passengers) )

row.names( d.temp ) <- NULL  
pander( d.temp )

and it is returning with the error: Error in passengers[198:200, ] : incorrect number of dimensions

I think it has something to do with the way that the data is loaded as a list rather than a matrix...? I'm really not sure but any help here would be appreciated!

Dselby86 commented 4 years ago

In the lecture, the data is likely a data frame. The data for the lab is a vector, meaning you have to build the table yourself. What that block of code is trying to do is access different parts of a data frame. The problem is that passengers is a vector, which doesn't have rows and columns.
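A quick way to confirm this is to inspect the object before indexing it. The sketch below uses a short made-up vector `x` (not the lab data) to show why two-dimensional indexing fails on a vector and works once the vector is wrapped in a data frame.

```r
# Toy stand-in for the lab data (hypothetical values, not the real series)
x <- c(1328, 1407, 1425, 1252, 1287, 1353)

class(x)   # "numeric": a plain vector
dim(x)     # NULL: no rows or columns

# Two-dimensional indexing on a vector fails with
# "incorrect number of dimensions"; try() keeps the script running
res <- try(x[2:3, ], silent = TRUE)
inherits(res, "try-error")   # TRUE

# One-dimensional indexing works on the vector
x[2:3]

# After wrapping the vector in a data frame, row indexing works
d <- data.frame(Y = x)
d[2:3, , drop = FALSE]
```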

The first step of this problem is to create the time variable. Since there are 365 sequential observations, you can use 1:365 to create the time variable. From there you should be able to use the information provided to create the Treatment and TimeSince variables.

I would do this by working with data frames, not lists, especially since the analysis will require you to use a data frame.
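As a rough sketch of those steps, assume a toy series of 10 observations and a hypothetical intervention beginning at observation 6. The name `start` and its value are made up for illustration; use the start date given in the lab instructions.

```r
# Toy outcome vector standing in for the lab data (hypothetical values)
Y <- c(12, 14, 13, 15, 16, 22, 24, 23, 26, 27)

n     <- length(Y)
start <- 6   # hypothetical first treated period; an assumption

Time      <- 1:n                                        # sequential time index
Treatment <- ifelse(Time < start, 0, 1)                 # 0 before, 1 from start on
TimeSince <- ifelse(Time < start, 0, Time - start + 1)  # periods since start

d <- data.frame(Y, Time, Treatment, TimeSince)

# Rows can now be selected the way the lecture code expects
d[5:7, ]
```

Because `d` is a data frame, it can also be passed straight to `lm()` through the `data` argument.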

Also, be careful when copying code from the lecture without understanding what it is doing.

round(passengers$Y ,1)

Rounding is an odd method to call on this variable because you can't have part of a person.

lecy commented 4 years ago

Note that the example in the instructions has ellipses to signal that not all of the data is printed. They are not part of the actual dataset.

meliapetersen commented 4 years ago

Hi @Dselby86 and @lecy

I am having an issue with plotting the results to show the counterfactual points. When I code the first plot, I am getting a blank plot with no regression points. Here is my code:


data1 <- as.data.frame( cbind( time.data = 121, treatment = 1, timesince = 1 )) 

y1 <- predict( regTS, data1 ) 

plot( passengers,
      bty="n",
      col = gray(0.5,0.5), pch=19,
      xlim = c(1, 365), 
      ylim = c(0, 300),
      xlab = "Time (days)", 
      ylab = "Bus Traffic")

points( 121, y1, col = "dodgerblue4", 
        pch = 19, bg = "dodgerblue4", cex = 2 )
text( 121, y1, labels = "t = 201", pos = 4, cex = 1 )

abline( v=121, col="red", lty=2 )

And a screenshot of what my results look like:

[Screenshot: Screen Shot 2020-07-09 at 11 25 59 AM]

Thanks

lecy commented 4 years ago

Your code looks fine. We would need a bit more info to diagnose.

Can you give us a preview of your dataset and the code used to create regTS?

Regarding the predict() function: it is useful if your regression has 20 variables and lots of moving parts. If it's only a handful of variables, it is easier to avoid errors with an algebraic approach:

b0 <- xxx  # coefficients go here
b1 <- xxx
b2 <- xxx
yhat.male <- b0 + b1*20 + b2*1  # men with 20 years experience
yhat.fem <- b0 + b1*20 + b2*0  # women with 20 years experience

plot( experience, wages )
points( 20, yhat.male )

You can also parse coefficients if you don't want to type them (often helpful in case your model changes):

m <- lm( y ~ x1 + x2 + ... + xk )
b <- coefficients( m )
b0 <- b[1]
b1 <- b[2]
# etc
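
To make the two approaches concrete, here is a small sketch on simulated data (the variable names and numbers are invented for illustration, not taken from the lab). It suggests that the algebraic calculation with parsed coefficients gives the same value as predict():

```r
set.seed(1)

# Simulated data: invented for illustration only
experience <- runif(50, 0, 40)
male       <- rbinom(50, 1, 0.5)
wages      <- 20 + 1.5 * experience + 5 * male + rnorm(50, sd = 3)

m <- lm(wages ~ experience + male)
b <- coefficients(m)

# Algebraic prediction: note the explicit * for multiplication
yhat.male <- b[1] + b[2] * 20 + b[3] * 1   # men with 20 years experience

# Same prediction with predict(); column names must match the model terms
yhat.pred <- predict(m, data.frame(experience = 20, male = 1))

all.equal(unname(yhat.male), unname(yhat.pred))

plot(experience, wages)
points(20, yhat.male, pch = 19, cex = 2)
```

The column names in the data frame passed to predict() have to match the names used in the model formula exactly, which is one place where errors tend to creep in.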
meliapetersen commented 4 years ago

@lecy Here are my variables:

# Prepare data 
passengers <- 
c(1328, 1407, 1425, 1252, 1287, 1353, 1301, 1294, 1336, 1371, 
1408, 1326, 1364, 1295, 1320, 1260, 1347, 1316, 1287, 1292, 1259, 
1349, 1274, 1365, 1317, 1341, 1316, 1313, 1285, 1369, 1309, 1446, 
1422, 1397, 1358, 1310, 1294, 1373, 1161, 1320, 1376, 1335, 1382, 
1455, 1374, 1267, 1318, 1370, 1297, 1391, 1269, 1341, 1238, 1391, 
1296, 1260, 1330, 1447, 1296, 1389, 1278, 1319, 1333, 1372, 1325, 
1299, 1299, 1312, 1352, 1355, 1404, 1317, 1330, 1325, 1368, 1311, 
1310, 1242, 1247, 1366, 1401, 1282, 1298, 1301, 1341, 1353, 1398, 
1352, 1300, 1442, 1365, 1411, 1360, 1100, 1334, 1336, 1274, 1303, 
1487, 1341, 1436, 1294, 1390, 1338, 1400, 1325, 1352, 1353, 1288, 
1304, 1338, 1355, 1212, 1386, 1426, 1380, 1425, 1287, 1337, 1288, 
1348, 1308, 1402, 1370, 1401, 1363, 1312, 1457, 1367, 1320, 1338, 
1447, 1371, 1402, 1461, 1382, 1260, 1341, 1309, 1317, 1509, 1403, 
1324, 1347, 1351, 1307, 1267, 1312, 1472, 1403, 1327, 1501, 1470, 
1438, 1416, 1369, 1355, 1317, 1448, 1423, 1401, 1356, 1400, 1356, 
1452, 1435, 1387, 1372, 1390, 1538, 1460, 1474, 1510, 1360, 1424, 
1275, 1381, 1453, 1430, 1404, 1350, 1375, 1327, 1312, 1464, 1478, 
1536, 1397, 1229, 1337, 1442, 1316, 1455, 1312, 1505, 1440, 1408, 
1429, 1280, 1560, 1422, 1363, 1349, 1326, 1400, 1464, 1488, 1352, 
1485, 1446, 1540, 1435, 1377, 1287, 1480, 1353, 1359, 1493, 1387, 
1314, 1478, 1306, 1462, 1533, 1261, 1488, 1482, 1461, 1452, 1540, 
1438, 1423, 1425, 1353, 1489, 1546, 1401, 1459, 1527, 1341, 1516, 
1406, 1414, 1442, 1272, 1371, 1435, 1446, 1287, 1496, 1442, 1614, 
1305, 1459, 1342, 1478, 1501, 1357, 1428, 1444, 1431, 1425, 1434, 
1488, 1508, 1454, 1436, 1485, 1522, 1437, 1396, 1407, 1382, 1444, 
1494, 1303, 1552, 1282, 1352, 1412, 1378, 1579, 1543, 1425, 1404, 
1380, 1593, 1555, 1532, 1514, 1485, 1504, 1442, 1401, 1453, 1493, 
1522, 1417, 1545, 1422, 1540, 1447, 1447, 1575, 1431, 1516, 1542, 
1519, 1485, 1526, 1400, 1563, 1471, 1517, 1506, 1514, 1444, 1348, 
1588, 1574, 1275, 1331, 1436, 1475, 1570, 1513, 1469, 1573, 1432, 
1467, 1513, 1475, 1572, 1430, 1512, 1532, 1487, 1474, 1508, 1410, 
1455, 1445, 1544, 1500, 1517, 1496, 1606, 1613, 1526, 1487, 1540, 
1511, 1534, 1620, 1409, 1542, 1517, 1493, 1443, 1463, 1391, 1583, 
1516, 1700, 1422)

time.data <- c(1:365)

treatment <- 
    ifelse(time.data<121, 0, 1)

timesince <- 
        ifelse(time.data<121, 0, time.data - 120 )

And here is my stats table code:

#Summary statistics table in stargazer 

regTS <- lm ( passengers ~ time.data + treatment + timesince)  # Our time series model

stargazer( regTS, 
           type = "text", 
           dep.var.labels = ("Bus Ridership"),
           column.labels = ("Model results"),
           covariate.labels = c("Time", "Treatment", 
                                "Time Since Treatment"),
           omit.stat = "all", 
           digits = 2 )

lecy commented 4 years ago

Also, check your range of Y. Then double-check this argument in plot():

ylim = c(0, 300),

lecy commented 4 years ago

OK, data looks fine. I think you are just highlighting a window of your plot that doesn't contain data.

See the other comment above.

Dselby86 commented 4 years ago

Instead of ylim = c(0,300) you may want to use ylim = c(min(passengers), max(passengers)), or equivalently ylim = range(passengers). That ensures the plot window covers the full vertical range of the data. Of course, there are drawbacks to a Y-axis that does not start at 0.
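
One way to check this, sketched with a short made-up vector `y` in place of the lab data:

```r
# Toy values on the same scale as the lab data (hypothetical)
y <- c(1328, 1407, 1425, 1252, 1287, 1353)

range(y)   # smallest and largest value

# c(0, 300) lies entirely below the data, so nothing would be drawn
any(y >= 0 & y <= 300)   # FALSE

# Let the data set the vertical limits instead
plot(seq_along(y), y,
     ylim = range(y),
     pch = 19, col = gray(0.5, 0.5), bty = "n",
     xlab = "Time (days)", ylab = "Bus Traffic")
```

If a zero baseline matters for the reader, `ylim = c(0, max(y))` is an alternative that keeps the data visible while anchoring the axis at 0.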
