Closed: louisgry closed this issue 5 years ago.
Hi, the URL in the README is actually the raw data. You can generate the processed training data using python -m scripts.generate_training_data --output_dir=data/METR-LA
On Sun, Oct 7, 2018 at 12:51 AM Louis notifications@github.com wrote:
Thanks. But what I mean is that I want to get the raw data. Is it available?
Hi, where can I get an introduction to the dataset?
Hi, where can I find information on how the distances_la_2012.csv file is created? Are those the distances between every single sensor and every single other one?
(Automatic reply from QQ Mail: your email has been received.)
This could be helpful for reading the h5 files and converting them to csv: 207 detectors, speeds in 5-minute intervals.
import pandas as pd
import h5py

# h5 file path
filename = 'metr-la.h5'

# list the keys stored in the h5 file
with h5py.File(filename, 'r') as dataset:
    print(list(dataset.keys()))  # returns ['df']

# export the h5 file to csv using the key 'df'
with pd.HDFStore(filename, 'r') as store:
    df = store.get('df')
df.to_csv('metr-la.csv')
If you're using the numpy arrays in the npy example: https://pytorch-geometric-temporal.readthedocs.io/en/latest/_modules/torch_geometric_temporal/dataset/metr_la.html. The first array (node_values.npy) is simply the speed values from the metr-la.h5 file.
The second array is the adjacency matrix (adj_mat.npy) as discussed in the article. It's created from graph_sensor_ids.txt and distances_la_2012.csv. Then it's somehow flattened, which I'm still trying to figure out.
import numpy as np
from pathlib import Path

try:
    import common
    DATA = common.dataDirectory()
except ImportError:
    DATA = Path().resolve() / 'data'

# Speeds from metr-la.h5 https://drive.google.com/drive/folders/10FOTa6HXPqX8Pf5WRoRwcFnW9BrNZEIX (see convert h5_to_csv.py)
# npy formatted examples from https://graphmining.ai/temporal_datasets/METR-LA.zip https://graphmining.ai/temporal_datasets/
NODE_VALUES = DATA / 'traffic_data' / 'METR-LA' / 'node_values.npy'

# view npy data
data = np.load(NODE_VALUES)
np.set_printoptions(suppress=True)
print(data[0].tolist())
print(len(data[0]))
print(len(data))

# adjacency matrix: distances flattened?
# npy formatted examples from https://graphmining.ai/temporal_datasets/METR-LA.zip https://graphmining.ai/temporal_datasets/
ADJACENCY_MATRIX = DATA / 'traffic_data' / 'METR-LA' / 'adj_mat.npy'

# view npy data
data = np.load(ADJACENCY_MATRIX)
with np.printoptions(threshold=np.inf):
    print(data[0].tolist())
print(len(data[0]))
print(len(data))
This PowerPoint has something to do with that: https://www.slideshare.net/chirantanGupta1/traffic-prediction-from-street-network-imagespptx
Speeds in metr-la.h5 file match the numpy array.
And this repository is also related to the adjacency matrix: https://github.com/FelixOpolka/STGCN-PyTorch
> Hi, where can I find information on how the distances_la_2012.csv file is created? Are those the distances between every single sensor and every single other one?
Probably just the pairwise distances between every two detectors, arranged in matrix shape (207 detectors × 207 detectors); it's related to the adjacency matrix found in adj_mat.npy.
from geopy.distance import geodesic

origin = (30.172705, 31.526725)       # (latitude, longitude), in that order
destination = (30.288281, 31.732326)

print(geodesic(origin, destination).meters)      # 23576.805481751613
print(geodesic(origin, destination).kilometers)  # 23.576805481751613
print(geodesic(origin, destination).miles)       # 14.64994773134371
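To get the full sensor-to-sensor table, the same idea can be extended to a pairwise matrix. A rough sketch with a plain haversine helper (toy coordinates, not the real sensor locations), keeping in mind that straight-line distances differ from the road distances in distances_la_2012.csv:

```python
import numpy as np

def haversine_m(p, q):
    # great-circle distance in meters between two (lat, lng) points
    lat1, lng1, lat2, lng2 = map(np.radians, (*p, *q))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6371000 * np.arcsin(np.sqrt(a))

# toy (latitude, longitude) coordinates for three detectors
coords = [
    (34.14745, -118.37124),
    (34.15497, -118.31829),
    (34.12086, -118.36368),
]

# pairwise distance matrix: symmetric, zeros on the diagonal
dist = np.array([[haversine_m(p, q) for q in coords] for p in coords])
print(dist.round(1))
```

For 207 detectors this gives the 207 × 207 shape discussed above; a road-network distance (e.g. via Dijkstra, as in the OpenStreetMap example in this thread) would be needed to actually match the csv values.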
For each detector node, the nearest 12 detector nodes are added to the adjacency matrix; the rest are filled with 0s. The weight of a detector node's edge to itself is always 1. You could maybe also use the k-nearest-neighbors algorithm? I don't know.
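For what it's worth, DCRNN's own adjacency construction (scripts/gen_adj_mx.py in the repo, if I read it right) uses a Gaussian kernel of the road distances with small weights thresholded to 0 rather than a fixed nearest-k cutoff. A minimal sketch with a toy distance matrix (the 0.1 threshold is my assumption of the default):

```python
import numpy as np

def gaussian_adjacency(dist, threshold=0.1):
    # Gaussian kernel of the pairwise road distances; entries below the
    # threshold are zeroed out so the matrix becomes sparse
    std = dist[~np.isinf(dist)].std()
    adj = np.exp(-np.square(dist / std))
    adj[adj < threshold] = 0.0
    return adj

# toy 4-detector distance matrix in meters (inf = no route between detectors)
dist = np.array([
    [0.0, 500.0, 2000.0, np.inf],
    [400.0, 0.0, 800.0, 3000.0],
    [2000.0, 900.0, 0.0, 600.0],
    [np.inf, 3000.0, 700.0, 0.0],
])

adj = gaussian_adjacency(dist)
print(np.diag(adj))  # a detector's weight to itself is always exp(0) = 1
```

This would also explain the earlier question about the matrix being "somehow flattened": the raw distances are squashed into [0, 1] weights, not kept as meters.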
I'm trying to reconstruct this example here because I have my own data from a different city: https://colab.research.google.com/drive/132hNQ0voOtTVk3I4scbD3lgmPTQub0KR?usp=sharing
> Hi, where can I find information on how the distances_la_2012.csv file is created? Are those the distances between every single sensor and every single other one?
Dijkstra finds the optimal route between two detectors. https://github.com/liyaguang/DCRNN/issues/8#issuecomment-424170421
An example with Dijkstra using OpenStreetMap. The output is slightly different from Google Maps. https://medium.com/p/2d97d4881996
import osmnx as ox
import networkx as nx

ox.config(log_console=True, use_cache=True)

# define the start and end locations in (lat, lng)
start_latlng = (34.14745, -118.37124)  # Detector 717490
end_latlng = (34.15497, -118.31829)    # Detector 773869

# create a graph from OSM within the boundaries of a geocodable place
place = 'Los Angeles, California, United States'
mode = 'drive'        # mode of travel: 'drive', 'bike', 'walk'
optimizer = 'time'    # optimize the route by 'length' or 'time'
graph = ox.graph_from_place(place, network_type=mode)

# find the nearest graph nodes to the start and end locations
orig_node = ox.get_nearest_node(graph, start_latlng)
dest_node = ox.get_nearest_node(graph, end_latlng)

# find the shortest route
shortest_route = nx.shortest_path(graph, orig_node, dest_node, weight=optimizer)

# shortest path length in meters; method is 'dijkstra' or 'bellman-ford'
shortest_route_distance = nx.shortest_path_length(
    graph, orig_node, dest_node, weight='length', method='dijkstra')

# distance between 717490 and 773869 with OpenStreetMap is 8252.298,
# while the original value in the dataset was 7647.0
print('Distance: ' + str(shortest_route_distance))
> Hi, where can I find information on how the distances_la_2012.csv file is created? Are those the distances between every single sensor and every single other one?
Here's how I created my own distance matrix: https://github.com/ThomasAFink/osmnx_adjacency_matrix_for_graph_convolutional_networks
I have already found the data I need in other repositories. Sorry to bother you. Thank you for your timely attention and wonderful work.
Hi Louis, you may get the dataset following instructions in the README.