mmcdermott EventStreamGPT issues

mmcdermott / EventStreamGPT

Dataset and modelling infrastructure for modelling "event streams": sequences of continuous time, multivariate events with complex internal dependencies.

https://eventstreamml.readthedocs.io/en/latest/

MIT License

102 stars 16 forks

issues

Fixes polars update which modifies enable_string_cache and broke ACES dataloading

#120 justin13601 opened 1 month ago
1
build_dataset fails when aggregating timestamps into buckets

#119 juancq opened 2 months ago
0
Setting min_seq_len=1 in PytorchDatasetConfig with task dataframes leads to ragged tensors index error

#118 juancq opened 3 months ago
4
Test cases and (eventually) fixes for #114

#117 mmcdermott opened 4 months ago
2
Test cases and (eventually) fixes for #114

#116 mmcdermott closed 4 months ago
1
Some small changes to update things for more recent versions of packages.

#115 mmcdermott closed 4 months ago
1
Subject ID splits can get messed up if subjects are not simple int types.

#114 mmcdermott opened 4 months ago
4
Processing Synthetic Data with ESGPT

#113 sujaybanerjee opened 5 months ago
1
Task `to_int_index` does not pass tests with more recent versions of polars

#112 mmcdermott opened 5 months ago
0
Support control for how events are aggregated/combined into compound events

#111 juancq closed 5 months ago
1
end_time in task dataframes is not counted as an event (right open interval)

#110 juancq opened 5 months ago
1
Setting min_seq_len in PytorchDatasetConfig to < 2 with task dataframes leads to keyerror

#109 juancq closed 3 months ago
5
Could not override 'config.task_specific_params.pooling_method' when launching wandb agent

#108 Rhett-Ying opened 5 months ago
3
Adjusted join in flat reps to account for different timestamps with t…

#107 pargaw opened 6 months ago
1
Invalid series dtype: expected `Utf8`, got `datetime[ns]`

#106 rvandewater closed 5 months ago
2
Add computation over future summary windows to flat reps

#105 mmcdermott closed 6 months ago
2
Nested ragged tensors integration does not respect `seq_padding_side`

#104 mmcdermott opened 6 months ago
1
Specify test set exclusions

#103 mmcdermott closed 6 months ago
1
Fixes a metrics thing and enables tuning subset sizes as well for speed.

#102 mmcdermott closed 6 months ago
1
Polars upgrade to 0.20+

#101 mmcdermott closed 6 months ago
1
Removed the brittle polars dataset test.

#100 mmcdermott closed 6 months ago
0
Split improperly shared parameters.

#99 mmcdermott closed 6 months ago
1
Added seeding to caching of data.

#98 mmcdermott closed 6 months ago
1
Removed pre-processors as there was only one option in use now and these will be phased out with MEDS

#97 mmcdermott closed 6 months ago
1
Getting started steps

#96 rvandewater opened 7 months ago
1
sklearn baseline pipe code should be using logger

#95 mmcdermott opened 7 months ago
0
Fixed a doc mismatch and a scripts typo.

#94 mmcdermott closed 7 months ago
1
May be able to make temporal loss more stable by directly computing log-prob from pre-activation function outputs

#93 mmcdermott opened 9 months ago
2
Dev

#92 payalchandak closed 10 months ago
0
Should be possible to resume pre-training and evaluate a pre-trained model post-hoc

#91 mmcdermott opened 10 months ago
0
Fixes the slowdowns and bugs caused by the prior improved compute practices, but requires a nested tensor package.

#90 mmcdermott closed 5 months ago
12
Merging recent changes into main

#89 mmcdermott closed 11 months ago
0
Force foreign-key constraints among dfs

#88 Jwoo5 closed 11 months ago
0
Updated some deprecated polars functions.

#87 mmcdermott closed 11 months ago
0
Nested caching

#86 pargaw closed 11 months ago
0
Adds support and a script for exporting an ESDS dataset from an ESGPT dataset (with support for modifier columns).

#85 mmcdermott closed 11 months ago
0
v0.5 changes: logging, computational improvements, and more

#84 mmcdermott closed 11 months ago
1
Added logging to other aspects of ESGPT.

#83 mmcdermott closed 11 months ago
0
Remove the `with_row_count` to create new event IDs to speed up dataset creation and permit lazy frames to be used longer.

#82 mmcdermott closed 11 months ago
0
Remove unnecessary subject ID conversion to streamline dataset creation.

#81 mmcdermott closed 11 months ago
0
Proper logs

#80 mmcdermott closed 11 months ago
0
Enable flash attention

#79 mmcdermott closed 11 months ago
0
Removed the old increment calls to assign event IDs in favor of hashes of subject IDs and timestamps which can be run lazily.

#78 mmcdermott closed 11 months ago
0
Duplication bug in task caching

#77 mmcdermott closed 12 months ago
0
Improve all aspects of compute performance (save disk space cost) for pytorch datasets by pre-caching processed items.

#76 mmcdermott closed 11 months ago
10
PytorchDataset doesn't have a way to extract subject_ids, start_time, and end_time all together

#75 pargaw opened 1 year ago
0
Replace use of list for cached_data and subject_ids in pytorch dataset with polars objects

#74 juancq closed 1 year ago
3
DataLoader with num_workers > 0 increases memory consumption over time

#73 juancq opened 1 year ago
2
Misspelled measurement_configs in transformer/config.py

#72 juancq closed 1 year ago
1
Fixed mutable defaults so that the tests do not error on py311.

#71 mmcdermott closed 1 year ago
1