mtopaz / NimbleMiner

NimbleMiner: software that allows users to interact with word embeddings to rapidly create lexicons of similar terms, conduct weakly supervised labeling, and implement text mining
GNU General Public License v2.0

any tutorials on the UI? #19

Open bhomass opened 1 year ago

bhomass commented 1 year ago

I have NimbleMiner installed, but I can't figure out what to do with the UI features. The very first tab is for category, but nowhere in the paper does the term category come up. The same is true of the third tab, simclin: I can't find any reference to that either. Are there any blogs on how to get started with the software? I would really appreciate it.

paritoshk commented 1 year ago

We have a paper coming up on that. Simclins are specific n-grams to detect, and category is a column that is relevant for an indication only when you have EHR data called note (clinical note).

bhomass commented 1 year ago

Even routine processing in the app is unclear. I clicked on the "Build word2vec model" button, and the web page stayed shaded for hours. I have to refresh the page to remove the shading, and I have no idea what kind of processing was done or whether there were errors. As far as I can tell, a simclins_tree.csv file was generated, which is a tiny file. Is that the expected output for the word2vec run? I thought I would get a dictionary of word vectors.

paritoshk commented 1 year ago

Bruce,

The app works fine for me. It does require extensive R and NLP experience to understand.

First of all - What are you trying to do? Secondly - What dataset are you using?

The app is used for hypothesis testing and feature detection in natural language.

For the app to work, the data must be formatted as a table with these column names:

index | note

Notes must be clean.

Secondly, you have to supply two lists of simclins (the keywords that will mark the note positive and the keywords that will mark the note negative).

In the end you will get

P|FP|FN|N (positive, false positive...) set of notes

Unless you do all of the above, the app will not work.

Best, Paritosh
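
As a rough illustration of the input format and keyword-based labeling described above, here is a minimal Python sketch (not NimbleMiner's R code). The column names `index` and `note` come from the comment above. The sample notes, the simclin lists, and the `label_note` and `label_table` helpers are hypothetical, and the rule "positive if any positive simclin matches and no negative simclin matches" is an assumption made for illustration, not the app's confirmed logic.

```python
import csv
import io

# Hypothetical sample data in the two-column layout described above.
SAMPLE_CSV = """index,note
1,patient had a fall at home last week
2,patient denies any fall or dizziness
3,routine follow-up for hypertension
"""

# Hypothetical simclin lists (illustrative keywords only).
POSITIVE_SIMCLINS = ["fall", "fell"]
NEGATIVE_SIMCLINS = ["denies any fall", "no fall"]


def label_note(note, positive, negative):
    """Return True when a positive simclin matches and no negative one does.

    This rule is an assumption made for illustration.
    """
    text = note.lower()
    has_positive = any(term in text for term in positive)
    has_negative = any(term in text for term in negative)
    return has_positive and not has_negative


def label_table(csv_text, positive, negative):
    """Read an index/note table and return a dict of index -> label."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["index"]: label_note(row["note"], positive, negative)
        for row in reader
    }


labels = label_table(SAMPLE_CSV, POSITIVE_SIMCLINS, NEGATIVE_SIMCLINS)
for idx, is_positive in labels.items():
    print(idx, "positive" if is_positive else "negative")
```

Plain substring matching keeps the sketch short; a real lexicon would likely need tokenization so that a keyword does not match inside a longer, unrelated word.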

bhomass commented 1 year ago

I am an expert in NLP. I am more of a Python coder, but I can read R just fine. Before I study your entire code, I just want to get it running using your data. The requisite train.txt is already in the directory, along with many files named simclins.csv and simclins_tree.csv. I suppose that is sufficient to at least see the application run? My question is specifically about how you designed the application: what should I look for, either in the UI or in the file system?

alaabashayreh commented 1 year ago

Bruce- The PDF article included in the code provides a good description of how NimbleMiner works and served as a guideline for me. I used the system to generate lexicons (aka lists of simclins) based on word embeddings and for rule-based text classification. It worked just fine with some tweaking needed when training the embeddings model.
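
For readers wondering what generating a lexicon from word embeddings can look like, the following is a minimal sketch in Python rather than the app's R code. The three-dimensional vectors are made-up toy values, not output from NimbleMiner's word2vec model, and `cosine` and `most_similar` are hypothetical helpers. The sketch ranks vocabulary terms by cosine similarity to a seed term, which is one common way to propose candidate similar terms for a user to accept or reject. Whether NimbleMiner ranks candidates exactly this way is an assumption.

```python
import math

# Toy embeddings with made-up values (assumption for illustration only).
EMBEDDINGS = {
    "fall": [0.9, 0.1, 0.0],
    "fell": [0.8, 0.2, 0.1],
    "slipped": [0.7, 0.3, 0.0],
    "hypertension": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(seed, embeddings, top_n=2):
    """Return the top_n (term, score) pairs closest to the seed term."""
    seed_vec = embeddings[seed]
    scored = [
        (term, cosine(seed_vec, vec))
        for term, vec in embeddings.items()
        if term != seed
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


candidates = most_similar("fall", EMBEDDINGS)
print(candidates)
```

With these toy vectors, "fell" and "slipped" rank above "hypertension" for the seed "fall", which is the kind of candidate list a reviewer would then accept or reject term by term.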

bhomass commented 1 year ago

There is a PDF in the code? Where? I spent 10 minutes looking for it in the repo. There is not a single PDF file.

alaabashayreh commented 1 year ago

Here is the link: https://github.com/mtopaz/NimbleMiner/blob/master/1-s2.0-S1532046419300218-main.pdf

bhomass commented 1 year ago

I have read that paper thoroughly. I understand the algorithm behind it and even have some suggestions. My question is about documentation of the application UI. How did you figure out how to use it? Did you read through the R/Shiny code?

bhomass commented 1 year ago

I have been struggling to get this to work for over a month. When running R CMD INSTALL maxent.tar.gz, I get this error:

./maxent.h:24:10: fatal error: 'tr1/unordered_map' file not found

#include <tr1/unordered_map>

We installed it on an Ubuntu server. The app does come up, but if I click on anything, like the +Add button for entering a new simclin, the page grays out without adding anything.

You all never saw this problem while installing this lib?

I can see how it could have just worked. Two of my colleagues tried the installation on their Macs, and both worked right away. This only happens on my Mac.

bhomass commented 1 year ago

My issue is that my Mac is way too old to install the latest Xcode packages. So it is only a problem on my end. I will find other means to get this running and test it out. Thank you all.

bhomass commented 1 year ago

We installed the repo on an Ubuntu server and accessed it remotely. Now I have a working instance. And thanks for the PDF reference. From it, I discovered there actually is documentation on the UI, which is the paper's appendix.

https://ars.els-cdn.com/content/image/1-s2.0-S1532046419300218-mmc1.docx

This was my very first question at the top. I am answering my own question here for future inquirers.