liyaguang / DCRNN

Implementation of Diffusion Convolutional Recurrent Neural Network in Tensorflow
MIT License
1.19k stars, 394 forks

Wrong sensor IDs for MetrLA? #70

Open radandreicristian opened 2 years ago

radandreicristian commented 2 years ago

Are the sensor IDs for MetrLA correct? In the file graph_sensor_ids.txt, all the sensor IDs start with 7, but they correspond neither to the PeMS sensors nor to the MetrLA sensors. Is there any use for that file?
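
One way to check this kind of mismatch is to compare the IDs in the text file with the column labels of the speed table. The following is a minimal sketch, assuming the IDs file is a single comma-separated line and that the speed table uses sensor IDs as column labels. The helper name `compare_sensor_ids` and all sample values are hypothetical and are not taken from the real files.

```python
import pandas as pd


def compare_sensor_ids(ids_text, speed_df):
    """Compare a comma-separated ID string with the columns of a speed table."""
    # Assumption: IDs are separated by commas, possibly with stray whitespace.
    file_ids = {s.strip() for s in ids_text.split(',') if s.strip()}
    # Column labels may be ints or strings, so normalise them to strings.
    column_ids = {str(c) for c in speed_df.columns}
    return {
        'only_in_file': sorted(file_ids - column_ids),
        'only_in_columns': sorted(column_ids - file_ids),
        'shared': sorted(file_ids & column_ids),
    }


# Hypothetical sample values, not taken from the real files.
sample_ids_text = '700001,700002,700003'
sample_speeds = pd.DataFrame(
    {'700001': [60.0], '700002': [55.5], '700004': [48.0]}
)

report = compare_sensor_ids(sample_ids_text, sample_speeds)
print(report)
```

With the real files, `ids_text` would be the contents of graph_sensor_ids.txt and `speed_df` the table loaded from metr-la.h5. An empty `only_in_file` list would suggest the IDs file and the speed data agree.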

ThomasAFink commented 2 years ago

Great question. I also found that they start with 7. Also, are those the original speed timestamps in metr-la.h5, @liyaguang?

import h5py
import pandas as pd

# Path to the HDF5 file
filename = 'metr-la.h5'

# List the top-level keys in the file (the only key is 'df')
with h5py.File(filename, 'r') as dataset:
    print(list(dataset.keys()))

# Load the DataFrame stored under the 'df' key and save it as CSV
with pd.HDFStore(filename, 'r') as store:
    df = store.get('df')
    df.to_csv('metr-la.csv')

ThomasAFink commented 2 years ago

The IDs in the LA distances file don't start with 7: https://github.com/liyaguang/DCRNN/blob/master/data/sensor_graph/distances_la_2012.csv

ThomasAFink commented 2 years ago

Never mind, they actually do further down the list.

ThomasAFink commented 2 years ago

Okay, it's speed data in 5-minute intervals: https://towardsdatascience.com/build-your-first-graph-neural-network-model-to-predict-traffic-speed-in-20-minutes-b593f8f838e5
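
The interval can also be checked directly from the table's index rather than from a secondary source. This is a minimal sketch, assuming the DataFrame has a `DatetimeIndex`. The helper name `sampling_intervals` is hypothetical, and the sample table is synthetic rather than real METR-LA data.

```python
import numpy as np
import pandas as pd


def sampling_intervals(df):
    """Count the gaps between consecutive timestamps in the index."""
    return df.index.to_series().diff().dropna().value_counts()


# Synthetic stand-in for the real table: 12 readings, 5 minutes apart.
index = pd.date_range('2012-03-01 00:00', periods=12, freq='5min')
sample = pd.DataFrame({'sensor_a': np.linspace(50.0, 61.0, 12)}, index=index)

intervals = sampling_intervals(sample)
print(intervals)
```

On the real data, passing the `df` loaded from metr-la.h5 would show whether every gap is five minutes or whether some timestamps are missing.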

StefanBloemheuvel commented 1 year ago

Did anybody find a solution for this? There are now 4106 unique sensors in the dataset instead of 207.
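
If the larger count comes from the distances file, one possible workaround is to keep only the rows whose two endpoints both appear in the sensor ID list. The sketch below assumes the distances CSV has `from`, `to`, and `cost` columns and that the IDs file is one comma-separated line. Both layouts are assumptions, the helper names are hypothetical, and the sample rows are made up.

```python
import io

import pandas as pd

# Hypothetical stand-ins for graph_sensor_ids.txt and distances_la_2012.csv.
SAMPLE_IDS = '700001,700002'
SAMPLE_DISTANCES = """from,to,cost
700001,700002,120.5
700002,700001,130.0
700001,100009,900.0
100009,100010,50.0
"""


def filter_distances(distances, sensor_ids):
    """Keep only rows whose endpoints are both in sensor_ids."""
    mask = distances['from'].isin(sensor_ids) & distances['to'].isin(sensor_ids)
    return distances[mask].reset_index(drop=True)


def unique_sensors(distances):
    """Return the set of IDs that appear in either endpoint column."""
    return set(distances['from']) | set(distances['to'])


sensor_ids = SAMPLE_IDS.strip().split(',')
# Read the IDs as strings so they compare cleanly with the ID list.
distances = pd.read_csv(
    io.StringIO(SAMPLE_DISTANCES), dtype={'from': str, 'to': str}
)

print(len(unique_sensors(distances)))   # sensors before filtering
filtered = filter_distances(distances, sensor_ids)
print(len(unique_sensors(filtered)))    # sensors after filtering
```

Reading the ID columns as strings avoids silent mismatches between integer and string IDs, which is one plausible source of confusing counts.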

ThomasAFink commented 1 year ago

No, I got data from another city (Munich) and then built my own adjacency matrix: https://github.com/ThomasAFink/osmnx_adjacency_matrix_for_graph_convolutional_networks
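
For readers who want to build an adjacency matrix from their own distance data, one common choice is a thresholded Gaussian kernel over pairwise distances. The sketch below illustrates that general technique and is not necessarily what the linked repository does. The function name, the default threshold of 0.1, and the sample distances are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel_adjacency(dist, threshold=0.1):
    """Turn a pairwise distance matrix into a weighted adjacency matrix.

    Entries of `dist` equal to np.inf mark pairs with no known distance.
    """
    # Use the spread of the known distances as the kernel width.
    finite = dist[np.isfinite(dist)]
    sigma = finite.std()
    # Closer sensors get weights near 1; unknown (inf) pairs get 0.
    adj = np.exp(-np.square(dist / sigma))
    # Drop weak links so the graph stays sparse.
    adj[adj < threshold] = 0.0
    return adj


# Made-up distances between three sensors, in metres.
sample_dist = np.array([
    [0.0, 100.0, np.inf],
    [100.0, 0.0, 400.0],
    [np.inf, 400.0, 0.0],
])

adj = gaussian_kernel_adjacency(sample_dist)
print(adj)
```

The threshold controls sparsity: raising it removes more long-distance edges, while lowering it keeps the graph denser.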

ThomasAFink commented 1 year ago

Lots of data here from 40 cities: https://utd19.ethz.ch/

radandreicristian commented 1 year ago

@ThomasAFink @StefanBloemheuvel Sorry for the (very) late reply. I think I managed to get the data right sometime last year. I have a repo where I tried to get 2 of the datasets into a unified format with the latest (back then) versions of NumPy. Everything I found prior to that had some version conflicts in how the data was generated/stored, so I did a remake.

Feel free to use/contribute:

https://github.com/radandreicristian/traffic-datasets