Zethson closed this 2 years ago
Was merged, but no release yet.
Add GitHub tag with https://github.com/python-poetry/poetry/issues/313
I think that free text should not be part of X in its raw form. I'd add the free text to obs only and allow for an embedding column/matrix to be appended to X after MedCAT was applied.
Yes, I will take care of implementing an "autodetect" feature for this, so users are not forced to pass every free text column for obs only when creating the MuData/AnnData object.
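A rough sketch of how such an autodetection could work, just to have something concrete to discuss. Everything here is an assumption: the function name `detect_free_text_columns`, the thresholds, and the heuristic itself (a column counts as free text if it holds only strings, is wordy on average, and rarely repeats values). The input is a plain dict of column name to values, not an AnnData/MuData object.

```python
def detect_free_text_columns(columns, min_mean_tokens=3.0, min_unique_ratio=0.5):
    """Guess which columns hold free text (hypothetical heuristic).

    columns: dict mapping column name -> list of values (None = missing).
    A column is flagged when all non-missing values are strings, the mean
    number of whitespace-separated tokens is at least `min_mean_tokens`,
    and the share of distinct values is at least `min_unique_ratio`.
    """
    detected = []
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        # Skip empty columns and columns with any non-string value.
        if not present or not all(isinstance(v, str) for v in present):
            continue
        mean_tokens = sum(len(v.split()) for v in present) / len(present)
        unique_ratio = len(set(present)) / len(present)
        if mean_tokens >= min_mean_tokens and unique_ratio >= min_unique_ratio:
            detected.append(name)
    return detected


example_columns = {
    "age": [34, 51, 67],
    "sex": ["m", "f", "m"],
    "notes": [
        "Patient reports chronic cough and fever",
        "No known allergies, mild asthma since childhood",
        "Admitted with suspected pneumonia",
    ],
}
free_text = detect_free_text_columns(example_columns)
```

Columns flagged this way could then be routed to obs by default, with an explicit user-supplied list still taking precedence over the guess.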
1.2.6 was released and we should be ready to implement this now.
I rewrote our current implementation to work with the latest MedCAT. I think this still requires a redesign.
As discussed: Keep a "main" MedCAT object, so we do not lose any results.
Add a function to nicely display such an object, for easier navigation by the user.
Add pp functions to filter the object by specific values (such as TUI, CUI, type of disease, or symptoms).
Add a function that returns a binary column based on user filtering (e.g. which rows mention a pulmonary disease). That way the actual values (of which there may be several per row) never need to be stored in the AnnData object, only indicators, and only when they are needed.
Add a decorator for, or overwrite, plotting functions like umap, pca, etc., for example when coloring by pulmonary disease (y/n). This column might not be present in X or obs, so add it on the fly using the binary-column function from the previous point.
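To make the last two points concrete, here is a minimal sketch under stated assumptions. `EntityStore`, `color_by_entities`, the entity fields (`cui`, `name`, `group`) and the placeholder CUIs are all hypothetical, and the `umap` function is a stand-in that only returns the color values, not the real plotting call. obs is modeled as a plain dict of columns.

```python
import functools
from dataclasses import dataclass, field


@dataclass
class EntityStore:
    """Hypothetical "main" container: row id -> list of extracted entities."""

    entities: dict = field(default_factory=dict)

    def indicator(self, row_ids, **criteria):
        """One bool per row: True if any entity of that row matches all criteria."""
        return [
            any(
                all(ent.get(key) == value for key, value in criteria.items())
                for ent in self.entities.get(row, [])
            )
            for row in row_ids
        ]


def color_by_entities(store, named_filters):
    """Decorator: if `color` is missing from obs but names a known filter,
    compute the indicator column on the fly for this one plotting call."""

    def decorator(plot_func):
        @functools.wraps(plot_func)
        def wrapper(row_ids, obs, color):
            if color not in obs and color in named_filters:
                # Work on a shallow copy so the caller's obs stays untouched.
                obs = {**obs, color: store.indicator(row_ids, **named_filters[color])}
            return plot_func(row_ids, obs, color)

        return wrapper

    return decorator


# Placeholder data; the CUIs are made up for illustration.
store = EntityStore(
    {
        "p1": [{"cui": "C0000001", "name": "pneumonia", "group": "pulmonary"}],
        "p2": [{"cui": "C0000002", "name": "fever", "group": "symptom"}],
        "p3": [],
    }
)
filters = {"pulmonary_disease": {"group": "pulmonary"}}


@color_by_entities(store, filters)
def umap(row_ids, obs, color):
    # Stand-in for a real plotting function: just hand back the color values.
    return obs[color]


rows = ["p1", "p2", "p3"]
obs = {"age": [34, 51, 67]}
colors = umap(rows, obs, color="pulmonary_disease")
```

The indicator only exists for the duration of the call, so nothing entity-related is written to obs unless the user asks for it explicitly.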
Did I miss something, @Zethson?
No, sounds great. Feel free to show early drafts so that we can evaluate our approach before doubling down.
Thank you!
To extract keywords from free-text notes, we will be integrating MedCAT.
The goals are as follows:
Tasks in somewhat reasonable order