Summary
We need to build our own dataset of rides that did not end in a crash, so that we have data points where the target value is 0.
More details
Our crash dataset only contains records of crashes. Without records of rides that ended without a crash, every example has the same target value, so we cannot estimate the likelihood of a crash from this data alone.
Using the CitiBike rides dataset, we can map those rides onto the network and get data points for the network nodes where crash=0.
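
A rough sketch of how the combined dataset could be assembled, assuming each ride has already been mapped to a sequence of network node IDs (that mapping step is not shown here). The function names `build_labelled_rows` and `crash_rate_by_node`, and the node IDs in the usage note, are hypothetical and only illustrate the idea. The per-node rate is a naive empirical ratio, not a calibrated probability.

```python
from collections import Counter


def build_labelled_rows(crash_nodes, ride_paths):
    """Return (node_id, crash) rows.

    crash_nodes: iterable of node IDs, one per crash record (label 1).
    ride_paths:  iterable of rides, each a list of node IDs the ride
                 is assumed to pass through (label 0 for every node).
    """
    rows = [(node, 1) for node in crash_nodes]
    for path in ride_paths:
        rows.extend((node, 0) for node in path)
    return rows


def crash_rate_by_node(rows):
    """Naive per-node ratio: crash rows / all rows observed at that node."""
    crashes = Counter()
    totals = Counter()
    for node, crash in rows:
        totals[node] += 1
        crashes[node] += crash
    return {node: crashes[node] / totals[node] for node in totals}
```

For example, `build_labelled_rows(["A", "B"], [["A", "B", "C"], ["A", "C"]])` yields two crash=1 rows and five crash=0 rows, which gives a model both classes to learn from.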