mmcdermott/EventStreamGPT
Dataset and modelling infrastructure for modelling "event streams": sequences of continuous time, multivariate events with complex internal dependencies.
https://eventstreamml.readthedocs.io/en/latest/
MIT License
102
stars
16
forks
issues
Fixes polars update which modifies enable_string_cache and broke ACES dataloading
#120
justin13601
opened
1 month ago
1
build_dataset fails when aggregating timestamps into buckets
#119
juancq
opened
2 months ago
0
Setting min_seq_len=1 in PytorchDatasetConfig with task dataframes leads to ragged tensors index error
#118
juancq
opened
3 months ago
4
Test cases and (eventually) fixes for #114
#117
mmcdermott
opened
4 months ago
2
Test cases and (eventually) fixes for #114
#116
mmcdermott
closed
4 months ago
1
Some small changes to update things for more recent versions of packages.
#115
mmcdermott
closed
4 months ago
1
Subject ID splits can get messed up if subjects are not simple int types.
#114
mmcdermott
opened
4 months ago
4
Processing Synthetic Data with ESGPT
#113
sujaybanerjee
opened
5 months ago
1
Task `to_int_index` does not pass tests with more recent versions of polars
#112
mmcdermott
opened
5 months ago
0
Support control for how events are aggregated/combined into compound events
#111
juancq
closed
5 months ago
1
end_time in task dataframes is not counted as an event (right open interval)
#110
juancq
opened
5 months ago
1
Setting min_seq_len in PytorchDatasetConfig to < 2 with task dataframes leads to keyerror
#109
juancq
closed
3 months ago
5
Could not override 'config.task_specific_params.pooling_method' when launching wandb agent
#108
Rhett-Ying
opened
5 months ago
3
Adjusted join in flat reps to account for different timestamps with t…
#107
pargaw
opened
6 months ago
1
Invalid series dtype: expected `Utf8`, got `datetime[ns]`
#106
rvandewater
closed
5 months ago
2
Add computation over future summary windows to flat reps
#105
mmcdermott
closed
6 months ago
2
Nested ragged tensors integration does not respect `seq_padding_side`
#104
mmcdermott
opened
6 months ago
1
Specify test set exclusions
#103
mmcdermott
closed
6 months ago
1
Fixes a metrics thing and enables tuning subset sizes as well for speed.
#102
mmcdermott
closed
6 months ago
1
Polars upgrade to 0.20+
#101
mmcdermott
closed
6 months ago
1
Removed the brittle polars dataset test.
#100
mmcdermott
closed
6 months ago
0
Split improperly shared parameters.
#99
mmcdermott
closed
6 months ago
1
Added seeding to caching of data.
#98
mmcdermott
closed
6 months ago
1
Removed pre-processors as there was only one option in use now and these will be phased out with MEDS
#97
mmcdermott
closed
6 months ago
1
Getting started steps
#96
rvandewater
opened
7 months ago
1
sklearn baseline pipe code should be using logger
#95
mmcdermott
opened
7 months ago
0
Fixed a doc mismatch and a scripts typo.
#94
mmcdermott
closed
7 months ago
1
May be able to make temporal loss more stable by directly computing log-prob from pre-activation function outputs
#93
mmcdermott
opened
9 months ago
2
Dev
#92
payalchandak
closed
10 months ago
0
Should be possible to resume pre-training and evaluate a pre-trained model post-hoc
#91
mmcdermott
opened
10 months ago
0
Fixes the slowdowns and bugs caused by the prior improved compute practices, but requires a nested tensor package.
#90
mmcdermott
closed
5 months ago
12
Merging recent changes into main
#89
mmcdermott
closed
11 months ago
0
Force foreign-key constraints among dfs
#88
Jwoo5
closed
11 months ago
0
Updated some deprecated polars functions.
#87
mmcdermott
closed
11 months ago
0
Nested caching
#86
pargaw
closed
11 months ago
0
Adds support and a script for exporting an ESDS dataset from an ESGPT dataset (with support for modifier columns).
#85
mmcdermott
closed
11 months ago
0
v0.5 changes: logging, computational improvements, and more
#84
mmcdermott
closed
11 months ago
1
Added logging to other aspects of ESGPT.
#83
mmcdermott
closed
11 months ago
0
Remove the `with_row_count` to create new event IDs to speed up dataset creation and permit lazy frames to be used longer.
#82
mmcdermott
closed
11 months ago
0
Remove unnecessary subject ID conversion to streamline dataset creation.
#81
mmcdermott
closed
11 months ago
0
Proper logs
#80
mmcdermott
closed
11 months ago
0
Enable flash attention
#79
mmcdermott
closed
11 months ago
0
Removed the old increment calls to assign event IDs in favor of hashes of subject IDs and timestamps which can be run lazily.
#78
mmcdermott
closed
11 months ago
0
Duplication bug in task caching
#77
mmcdermott
closed
12 months ago
0
Improve all aspects of compute performance (save disk space cost) for pytorch datasets by pre-caching processed items.
#76
mmcdermott
closed
11 months ago
10
PytorchDataset doesn't have a way to extract subject_ids, start_time, and end_time all together
#75
pargaw
opened
1 year ago
0
Replace use of list for cached_data and subject_ids in pytorch dataset with polars objects
#74
juancq
closed
1 year ago
3
DataLoader with num_workers > 0 increases memory consumption over time
#73
juancq
opened
1 year ago
2
Misspelled measurement_configs in transformer/config.py
#72
juancq
closed
1 year ago
1
Fixed mutable defaults so that the tests do not error on py311.
#71
mmcdermott
closed
1 year ago
1