Open Chethan-Babu-stack opened 3 years ago
Hi, there's some support for multi-variate time series forecasting in GCN-LSTM, implemented in https://github.com/stellargraph/stellargraph/pull/1580. Unfortunately it's still very early days for this support:
I think you may be able to get something to work with (following your example):
- load the data into a StellarGraph as a tensor of shape [number of nodes, number of time steps, number of observations per node per time step (2 in this case)]. You can see one way to do this at https://stellargraph.readthedocs.io/en/stable/demos/basics/loading-numpy.html#Homogeneous-graph-with-non-numeric-IDs-and-feature-tensors
- use GCN_LSTM and SlidingFeaturesNodeGenerator
- (… x[..., 0] data, and ignores the x[..., 1] data)
You may've already seen the example doing univariate (that is, one observation per node per time step) prediction at https://stellargraph.readthedocs.io/en/stable/demos/time-series/gcn-lstm-time-series.html, which might be a good place to start (although it unfortunately doesn't use SlidingFeaturesNodeGenerator yet).
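To illustrate the tensor shape mentioned above, here is a minimal NumPy sketch that stacks two per-node time series into one array of shape [number of nodes, number of time steps, 2]. The array names (speed, cases) and the toy sizes are assumptions made up for illustration; they are not taken from the original data.

```python
import numpy as np

# Hypothetical toy data: 4 nodes, 6 time steps, one 2-D array per variate.
num_nodes, num_steps = 4, 6
speed = np.arange(num_nodes * num_steps, dtype=float).reshape(num_nodes, num_steps)
cases = np.ones((num_nodes, num_steps))

# Stack along a new last axis so the layout is (nodes, time steps, variates).
features = np.stack([speed, cases], axis=-1)
print(features.shape)  # (4, 6, 2)

# features[..., 0] is the first variate, features[..., 1] the second.
```

An array laid out this way keeps each variate addressable with x[..., 0] and x[..., 1], matching the indexing used in the comment above.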
Does that help get you started?
Hi, Thanks a lot for your response.
I tried as per your suggestion. I get an error as below when I try to train the model.
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:
UnimplementedError: Cast string to float is not supported
[[node model/Cast (defined at
Do you have any suggestions on how I can overcome this?
Thanks again :)
I'm glad you're able to make a little bit of progress! Unfortunately that's not nearly enough information for me to help. People like me can help you better in questions/bug reports if you include:
From the error message it sounds like you have string values somewhere numbers are expected, so the first thing I would do is print out information about types in various numpy arrays and/or TensorFlow tensors (for example, print(some_tensor.dtype)) and StellarGraph graphs (for example, print(some_stellargraph.info())).
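A minimal sketch of that kind of check, assuming the features were read from a CSV file as strings. The array contents and the describe helper are hypothetical and only serve to show what a string dtype looks like next to a float one.

```python
import numpy as np

def describe(name, array):
    """Print the dtype and shape of an array-like and return the dtype."""
    array = np.asarray(array)
    print(f"{name}: dtype={array.dtype}, shape={array.shape}")
    return array.dtype

# Values that look numeric but are stored as strings.
raw = np.array([["1.0", "2.5"], ["3.0", "4.5"]])
describe("raw", raw)  # reports a unicode string dtype

# Converting explicitly gives a float array that can be cast by the model.
converted = raw.astype(float)
describe("converted", converted)
```

If the dtype printed for a feature array is a string or object type rather than a float type, that array is a likely source of the "Cast string to float is not supported" error.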
Hi, I really appreciate your help.
I have attached a Google Colab notebook with the code changes I have made so far.
Also attached are two datasets that have to be loaded in the 7th step of the notebook. The dataset has two CSV files: one is the adjacency matrix and the other is the feature matrix of time series data.
As you suggested, I have created numpy arrays in the 16th step (method: sequence_data_preparation) of the notebook. The issue shows up in model.fit.
I hope this explains it in detail.
Thanks once again. Multivariate time series TGCN.zip
Great. Unfortunately I'm busy today and so won't be able to get to it until Monday next week (I've set a reminder). Please leave an update if you debug anything more in the meantime.
Hi @Chethan-Babu-stack, unfortunately I couldn't reproduce the issue you described above. I tried running it per https://colab.research.google.com/drive/10IlIlDVJxUARoWBh4ZfFO-Ph1AAcdotr?usp=sharing (I inlined the datasets so that the notebook is standalone), and saw:
WARNING:tensorflow:Model was constructed with shape (None, 10, 10) for input KerasTensor(type_spec=TensorSpec(shape=(None, 10, 10), dtype=tf.float32, name='input_9'), name='input_9', description="created by layer 'input_9'"), but it was called on an input with incompatible shape (None, 10, 10, 3).
...
ValueError: in user code:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:805 train_function *
return step_function(self, iterator)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:795 step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:1259 run
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2730 call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:3417 _call_for_each_replica
return fn(*args, **kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:788 run_step **
outputs = model.train_step(data)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:754 train_step
y_pred = self(x, training=True)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py:1012 __call__
outputs = call_fn(inputs, *args, **kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/functional.py:425 call
inputs, training=training, mask=mask)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/functional.py:560 _run_internal_graph
outputs = node.layer(*args, **kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py:1012 __call__
outputs = call_fn(inputs, *args, **kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/core.py:557 call
result.set_shape(self.compute_output_shape(inputs.shape))
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/core.py:548 compute_output_shape
self.target_shape)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/core.py:536 _fix_unknown_dimension
raise ValueError(msg)
ValueError: total size of new array must be unchanged, input_shape = [10, 10, 3, 1], output_shape = [10, 10]
Please let me know if this is the error you're struggling with, or provide an updated link that reproduces the problem (see https://research.google.com/colaboratory/faq.html "Where are my notebooks stored, and can I share them?" for details).
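The ValueError above comes down to element counts: each sample holds 10 × 10 × 3 × 1 = 300 values, while the target shape of 10 × 10 only has room for 100. The following NumPy sketch reproduces the same kind of mismatch outside of Keras; it is an analogy for the size check, not the Keras Reshape layer itself.

```python
import numpy as np

# One sample as the model received it:
# 10 nodes x 10 time steps x 3 observations x 1.
sample = np.zeros((10, 10, 3, 1))
print(sample.size)  # 300

# The shape the univariate model was built for only holds 100 elements.
target_shape = (10, 10)
try:
    sample.reshape(target_shape)
    reshaped = True
except ValueError as error:
    reshaped = False
    print("reshape failed:", error)
```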
Hi @huonw, yes, this is the point where I'm stuck now (the ValueError: total size of new array must be unchanged, input_shape = [10, 10, 3, 1], output_shape = [10, 10]). It is fine to share it :) Thanks, Chethan
Ah, I really would've appreciated if you'd noted that the error was different to the one in https://github.com/stellargraph/stellargraph/issues/1852#issuecomment-776763770, because it means I can help you better.
As I hinted above, the only way to do multi-variate GCN-LSTM at the moment is using SlidingFeaturesNodeGenerator. You'll need to turn the speed_data data frame into an IndexedArray (https://stellargraph.readthedocs.io/en/stable/api.html#stellargraph.IndexedArray) of shape 10 (for the nodes) × 699 (for the time stamps) × 3 (for the observations), load this into a StellarGraph and use SlidingFeaturesNodeGenerator. For instance:
import json
import numpy as np
import stellargraph as sg
from stellargraph.layer import GCN_LSTM
from stellargraph.mapper import SlidingFeaturesNodeGenerator
# convert nested strings to one big numpy array
speed_array = np.array([[json.loads(s) for s in row] for row in speed_data.to_numpy()], dtype=float)
print(speed_array.shape) # (10, 699, 3)
node_features = sg.IndexedArray(speed_array, index=speed_data.index)
train_size = ... # number of samples to use for training
graph = sg.StellarGraph(node_features, ...)
# windows of 10 time steps, in batches of 60
generator = SlidingFeaturesNodeGenerator(graph, 10, batch_size=60)
train_gen = generator.flow(slice(0, train_size), target_distance=1)
test_gen = generator.flow(slice(train_size, None), target_distance=1)
# seq_len and adj are left as None because the generator supplies them
gcn_lstm = GCN_LSTM(None, None, ..., generator=generator)
# build and compile a Keras model from gcn_lstm
...
model.fit(train_gen, validation_data=test_gen)
...
Unfortunately there's no example of this, but you can see how the tests do it at https://github.com/stellargraph/stellargraph/blob/1e6120fcdbbedd3eb58e8fecc0eabc6999101ee6/tests/layer/test_gcn_lstm.py#L192-L217
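To make the windowing idea concrete, here is a NumPy-only sketch of how sliding windows over the time axis can pair each input window with a target a fixed number of steps later. This illustrates the general technique under stated assumptions and is not StellarGraph's implementation: the helper name sliding_windows is hypothetical, and the exact shapes the real generator yields may differ.

```python
import numpy as np

def sliding_windows(features, window_size, target_distance=1):
    """Split features of shape (nodes, time steps, variates) into
    input windows and the observation target_distance steps after each."""
    num_steps = features.shape[1]
    last_start = num_steps - window_size - target_distance
    inputs, targets = [], []
    for start in range(last_start + 1):
        end = start + window_size
        inputs.append(features[:, start:end, :])
        targets.append(features[:, end + target_distance - 1, :])
    return np.stack(inputs), np.stack(targets)

# Same sizes as in the thread: 10 nodes, 699 time stamps, 3 observations.
features = np.random.rand(10, 699, 3)
x, y = sliding_windows(features, window_size=10)
print(x.shape, y.shape)  # (689, 10, 10, 3) (689, 10, 3)
```

Each input keeps the trailing axis of 3 observations, which is why a model built for inputs of shape (None, 10, 10) cannot consume it directly.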
Another option would be to add a variates argument to the GCN_LSTM constructor, to allow the manual version to do multi-variate prediction too. I will merge a pull request that:
- adds a variates=None argument
- removes the else branch
- updates the documentation
- (optionally) adds a test
Does that clarify things?
Hi @huonw, referencing your above comments, I have created a two-output model from data with dimensions 18 nodes by 216000 time stamps by 2 output observations (1 continuous, 1 categorical).
The resulting model is as follows:
The last reshape layer has dimensions (None, 18, 2), with the 2 reflecting the continuous (,,0) and categorical (,,1) outputs.
I would like to define different loss functions (i.e., mae and binary_crossentropy) for these two outputs. I could not seem to find a way to call each of these two dimensions individually when writing the loss functions in model.compile.
Any suggestions?
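One possible approach is a single custom loss that slices the last axis and applies a different loss to each slice. The sketch below shows the arithmetic in NumPy so that it can be checked by hand. The function name combined_loss, the weight parameter and the eps clipping value are assumptions for illustration, and it assumes the categorical output is already a probability in (0, 1), for example from a sigmoid. A Keras custom loss passed to model.compile receives (y_true, y_pred) in the same way, so the same slicing could be written with tensor operations.

```python
import math
import numpy as np

def combined_loss(y_true, y_pred, weight=1.0, eps=1e-7):
    """MAE on channel 0 (continuous) plus weighted binary
    cross-entropy on channel 1 (categorical, as probabilities)."""
    mae = np.mean(np.abs(y_true[..., 0] - y_pred[..., 0]))
    # Clip probabilities so that log() never sees 0 or 1 exactly.
    p = np.clip(y_pred[..., 1], eps, 1 - eps)
    t = y_true[..., 1]
    bce = -np.mean(t * np.log(p) + (1 - t) * np.log(1 - p))
    return mae + weight * bce

# Toy batch with shape (batch=1, nodes=2, outputs=2).
y_true = np.array([[[1.0, 1.0], [3.0, 0.0]]])
y_pred = np.array([[[1.5, 0.5], [2.5, 0.5]]])
loss = combined_loss(y_true, y_pred)
print(loss, 0.5 + math.log(2))  # the two values should agree
```

The weight parameter lets the two terms be balanced, since an MAE on raw values and a cross-entropy can differ a lot in scale.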
InvalidArgumentError Traceback (most recent call last)
Cell In[370], line 1
----> 1 model.fit(train_gen, validation_data=test_gen)
File ~\Anaconda3\envs\tensorflow2_p3_8\lib\site-packages\keras\utils\traceback_utils.py:70, in filter_traceback.tf.debugging.disable_traceback_filtering()
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
File ~\Anaconda3\envs\tensorflow2_p3_8\lib\site-packages\tensorflow\python\eager\execute.py:52, in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     50 try:
     51   ctx.ensure_initialized()
---> 52   tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
     53                                       inputs, attrs, num_outputs)
     54 except core._NotOkStatusException as e:
     55   if name is not None:
InvalidArgumentError: Graph execution error:
Detected at node 'mean_absolute_error/remove_squeezable_dimensions/Squeeze' defined at (most recent call last):
File "C:\Users\islam70\Anaconda3\envs\tensorflow2_p3_8\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\islam70\Anaconda3\envs\tensorflow2_p3_8\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\islam70\Anaconda3\envs\tensorflow2_p3_8\lib\site-packages\ipykernel_launcher.py", line 17, in
I have come up with a solution for the situation of multivariable input
In the code, I want to use the same adjacency matrix data for graph and change the speed dataset to have [speed, covid_cases_in that_place_at_that_time].
For instance, to check the traffic flow with speed and covid cases.
Please suggest how I could do this.
PS: I was thinking of encoding the two values into one and then using the same code. I'm not sure exactly how to encode them, maybe by applying some weights to both.