Dataset and modelling infrastructure for "event streams": sequences of continuous-time, multivariate events with complex internal dependencies.
I was working with the examine_synthetic_data notebook to get started, and I wanted to clarify/point out a few things that could potentially be improved:
The guide does not mention that it seems to work only on Linux (or other Unix-like systems such as macOS), given how it uses environment variables. That is not a problem, but it may be worth clarifying.
Perhaps also mention that users need to set the environment variables $PROJECT_DIR and $PYTHON_PATH.
The first line of build_dataset.py (the shebang) points to a default Python location, which is not what most people use when working in virtual environments. This could also be a mistake on my end.
Perhaps consider splitting the Jupyter notebook (and the Read the Docs pages) into a multi-part tutorial and/or making it a little less verbose, which could help users get started. This might especially help interdisciplinary folks. I was a bit overwhelmed, and that made getting started daunting. I could also help with this once I am more familiar with the pipeline.
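To make the two setup points above more concrete, here is a rough sketch of the kind of pre-flight check that could sit at the top of the tutorial. The helper names `missing_env_vars` and `read_shebang` are my own invention and not part of the package; the variable names are simply the two I ran into, and I am assuming an empty value should count as unset.

```python
import os
import sys


def missing_env_vars(names, environ=None):
    """Return the names from `names` that are unset or empty in `environ`."""
    # Hypothetical helper: falls back to the real process environment.
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


def read_shebang(path):
    """Return the shebang line of the file at `path`, or None if it has none."""
    # Hypothetical helper: only inspects the first line of the file.
    with open(path, "r", encoding="utf-8") as handle:
        first_line = handle.readline().strip()
    return first_line if first_line.startswith("#!") else None


if __name__ == "__main__":
    # Variable names taken from this issue; adjust if the guide uses others.
    missing = missing_env_vars(["PROJECT_DIR", "PYTHON_PATH"])
    if missing:
        print("Please set these environment variables first:", ", ".join(missing))
    # Shows which interpreter is actually running, e.g. the virtual environment's.
    print("Interpreter currently running:", sys.executable)
```

On the shebang point, a first line of `#!/usr/bin/env python` resolves to whichever interpreter comes first on the PATH, which is normally the active virtual environment's, so it may be a friendlier default than a hard-coded location.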
The package looks great so far, and it is clear you have put a lot of thought into everything :).