bptlab / TracEX

This bachelor's project focuses on event log extraction from patient journeys using large language models.

Last Cleansing #139

Closed nils-schmitt closed 5 months ago

nils-schmitt commented 6 months ago

Here we should collect everything we notice during the code check that would most likely end up in the final product but is not wanted there. Feel free to contribute!

Database:

[x] Delete all admin users (except "admin")
[x] Clean the DB one last time of unwanted PJs / Traces / Cohorts
[x] Traces with cohorts
[x] all traces have valid cohorts

Code:

[x] Universal case convention for buttons
[x] universal capitalization convention for html files
[x] Remove all print()
[x] check settings.py
[x] remove migrations from comparator
[x] delete tests
[x] activity_labeler 76: false typing
[x] clean post_processing
[x] Snomed integration

Optional:

[x] reset ids?

PitButtchereit commented 6 months ago

Buttons should have a universal case convention.

nils-schmitt commented 6 months ago

module_metrics_analyzer.__rate_timestamp_correctness should use the linear_prob directly from the query_gpt() function

PitButtchereit commented 6 months ago

module_metrics_analyzer.__rate_timestamp_correctness should use the linear_prob directly from the query_gpt() function

I've already refactored this on my branch 👍 The new code now looks like this:

timestamp_correctness, linear_probability = u.query_gpt(messages, return_linear_probability=True, top_logprobs=1)

I also changed the return value of the function accordingly.
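For context, a linear probability is usually obtained by exponentiating the log probability reported for a token. The following is a minimal sketch of that conversion only; the helper name `to_linear_probability` is hypothetical and not taken from TracEX, and the actual body of `query_gpt()` is not shown in this thread.

```python
import math


def to_linear_probability(logprob: float) -> float:
    """Convert a natural-log probability into a linear probability."""
    # Hypothetical helper for illustration; not part of the TracEX code base.
    return math.exp(logprob)


# A log probability of 0.0 corresponds to a linear probability of 1.0.
certain = to_linear_probability(0.0)
```

Returning the converted value together with the answer, as in the call above, would let the caller use it without recomputing it.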

PitButtchereit commented 6 months ago

This basically means having a final database image.

nils-schmitt commented 6 months ago

Remove all print statements
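One way to locate the remaining calls before deleting them is to walk the syntax tree of each module. This is only a sketch using the standard-library `ast` module; the function name `find_print_calls` is an assumption made for illustration and does not exist in the project.

```python
import ast


def find_print_calls(source: str) -> list[int]:
    """Return the line numbers of all direct print(...) calls in the source."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "print"
    )


# Only the call on line 2 is reported; str(x) is a different function.
example = "x = 1\nprint(x)\nresult = str(x)\n"
lines = find_print_calls(example)  # [2]
```

Unlike a plain text search, this ignores the word "print" inside strings and comments.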

PitButtchereit commented 6 months ago

Fix ResultViewTests

FR-SON commented 6 months ago

add valid cohorts for all traces remaining in the db

FR-SON commented 6 months ago

In settings.py there are these two lines that should be addressed:

# SECURITY WARNING: keep the secret key used in production secret!
SECRET_KEY = "django-insecure-$u00r=^xd*m1ggjgzwj%2o2$h=34k358#imaxe22w@stk_aptt"

# SECURITY WARNING: don't run with debug turned on in production!
DEBUG = True
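
One common way to address both lines is to read the values from environment variables instead of hard-coding them. A minimal sketch, assuming the hypothetical variable names `DJANGO_SECRET_KEY` and `DJANGO_DEBUG` (neither is confirmed to be used by TracEX):

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean flag."""
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


# Hypothetical variable names; adjust them to the actual deployment setup.
# Falls back to an empty string when unset (sketch only, not a safe default).
SECRET_KEY = os.environ.get("DJANGO_SECRET_KEY", "")
# Debug mode stays off unless it is explicitly enabled.
DEBUG = env_flag("DJANGO_DEBUG", default=False)
```

With this approach the key committed above would also need to be replaced by a newly generated one, since it is already public in the repository history.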

FR-SON commented 6 months ago

Remove the migrations directory from trace_comparator. It is empty aside from an init file, so it should not be required, especially because all models are located in the extraction app.

nils-schmitt commented 6 months ago

Push the web scraping PoC, annotated as having big potential for future use

nils-schmitt commented 6 months ago

Fix ResultViewTests

How though???

PitButtchereit commented 6 months ago

Fix ResultViewTests

How though???

Delete

nils-schmitt commented 6 months ago

Clean TTE html files

nils-schmitt commented 6 months ago

Make the HTML files consistent regarding the capitalization of words

FR-SON commented 6 months ago

Line 76 in module_activity_labler.py: user_message: List[str] = patient_journey_numbered has the wrong type annotation; it should be str.

The functions inside post_processing in module_time_extractor.py use a variable df that shadows a variable from the outer scope. It should be renamed to _df inside those functions. Additionally, the column parameter is typed as pd.Series but is actually a str.

tkv29 commented 6 months ago