LibCity / Bigscity-LibCity

LibCity: An Open Library for Urban Spatial-temporal Data Mining
https://libcity.ai/
Apache License 2.0

pipeline about running STAN #297

bb67ao closed this issue 2 years ago

bb67ao commented 2 years ago

I want to know how to organize the dataset (the output of your method: dyna, geo, usr) so that it satisfies STAN, but I couldn't find the code for this in the lib. Could you please give me a guideline?

WenMellors commented 2 years ago

While running STAN, I found a small bug caused by one of our recent commits. I have fixed it in #298 and tested STAN on the Foursquare-NYC dataset. I am not sure whether this bug is what was confusing you. As for the code responsible for processing the dataset (dyna, geo, usr), it is in TrajectoryDataset and StanEncoder.

The main flow of our trajectory data processing is as follows:

  1. We cut and filter the trajectories from the original check-in records in '.dyna'. This is implemented by 'TrajectoryDataset.cutter_filter'.
  2. We then encode the processed trajectories into the model's input (also called the feature or tensor). This is implemented by 'TrajectoryDataset.encode_traj'. Since different models require different features, we implement a separate trajectory encoder for each model. For STAN, the 'encode_traj' method invokes 'StanEncoder.encode' to extract the features.
  3. We divide the encoded input into the training set, the validation set, and the test set, and then form the Batch.
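
To make the three steps concrete, here is a minimal, self-contained sketch in plain Python. It is not LibCity's implementation: the function names (`cut_and_filter`, `encode`, `split_and_batch`), the 24-hour window, the minimum session length, the split rates, and the toy check-in records are all assumptions chosen for illustration, and the features STAN actually needs are produced by 'StanEncoder.encode', not by this toy encoder.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical check-in records shaped like rows of a '.dyna' file:
# (user id, timestamp, location id). The values are made up.
CHECKINS = [
    (0, "2012-04-03T08:00:00Z", 10),
    (0, "2012-04-03T12:00:00Z", 11),
    (0, "2012-04-03T18:00:00Z", 12),
    (0, "2012-04-05T09:00:00Z", 10),
    (0, "2012-04-05T10:00:00Z", 13),
    (0, "2012-04-09T09:00:00Z", 11),
    (1, "2012-04-03T07:30:00Z", 20),
    (1, "2012-04-03T09:00:00Z", 21),
    (1, "2012-04-04T20:00:00Z", 20),
    (1, "2012-04-04T21:00:00Z", 21),
    (1, "2012-04-04T23:00:00Z", 21),
]


def cut_and_filter(checkins, window_hours=24, min_len=2):
    """Step 1 (sketch): group by user, cut into time windows, drop short sessions."""
    by_user = defaultdict(list)
    for user, ts, loc in checkins:
        by_user[user].append((datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ"), loc))

    sessions = {}
    for user, records in by_user.items():
        records.sort()
        kept, current = [], []
        for t, loc in records:
            # Start a new session once the window since the session start is exceeded.
            if current and t - current[0][0] > timedelta(hours=window_hours):
                if len(current) >= min_len:
                    kept.append(current)
                current = []
            current.append((t, loc))
        if len(current) >= min_len:
            kept.append(current)
        sessions[user] = kept
    return sessions


def encode(user, session):
    """Step 2 (sketch): turn one session into model input; last point is the target."""
    start = session[0][0]
    locs = [loc for _, loc in session]
    minutes = [int((t - start).total_seconds() // 60) for t, _ in session]
    return {"uid": user, "loc": locs[:-1], "tim": minutes[:-1], "target": locs[-1]}


def split_and_batch(samples, train_rate=0.5, eval_rate=0.25, batch_size=2):
    """Step 3 (sketch): split into train/eval/test and chunk each part into batches."""
    n_train = int(len(samples) * train_rate)
    n_eval = int(len(samples) * eval_rate)
    parts = {
        "train": samples[:n_train],
        "eval": samples[n_train:n_train + n_eval],
        "test": samples[n_train + n_eval:],
    }
    return {
        name: [part[i:i + batch_size] for i in range(0, len(part), batch_size)]
        for name, part in parts.items()
    }


sessions = cut_and_filter(CHECKINS)
samples = [encode(u, s) for u, user_sessions in sessions.items() for s in user_sessions]
batches = split_and_batch(samples)
print(samples[0])
print({name: len(b) for name, b in batches.items()})
```

The point of the sketch is only the separation of concerns: the cutting and splitting logic is shared, while the encoder is the model-specific part, which is why a new model mainly needs its own encoder.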

I'm sorry that I'm only a maintainer of this lib, not the person who originally reproduced this model. I hope the above helps.