sunlabuiuc / PyHealth

A Deep Learning Python Toolkit for Healthcare Applications.
https://pyhealth.readthedocs.io
MIT License
956 stars, 207 forks

Do the models support text? #271

Closed bbb801 closed 7 months ago

bbb801 commented 7 months ago

Dear Sir or Madam,

I have fed text reports as input features into the Transformer model, and it runs. But I don't know whether doing so is meaningful. Can the model learn from the text the way language models do?

I followed case 2 for the Transformer model, where each code is a radiology report.

"case 2. [[code1, code2]] or [[code1, code2], [code3, code4, code5], …]"

ycq091044 commented 7 months ago

Hi, thanks for your message. There are two options:

  1. You could tokenize your radiology report (for example, treating each word as a token/code) and then use the Transformer model. Our model will automatically learn the word embeddings.
  2. You could also pre-load existing word2vec embeddings and provide the embeddings, in place of codes, as input to the Transformer model.
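
A rough sketch of option 1, under stated assumptions: this is plain Python rather than PyHealth code, and the function name `tokenize_report`, the sample keys, and the example reports are all hypothetical. Each report is split into word tokens so that one visit holds one list of tokens, which matches the nested-list shape quoted as case 2 above.

```python
import re


def tokenize_report(report: str) -> list[str]:
    """Lowercase a report and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", report.lower())


# Hypothetical example: one patient with two visits, one report per visit.
reports_per_visit = [
    "No acute cardiopulmonary process.",
    "Mild cardiomegaly; no pleural effusion.",
]

# Nested list: the outer level is visits, the inner level is word tokens.
tokens_per_visit = [tokenize_report(r) for r in reports_per_visit]

# Hypothetical sample dict; the key names are illustrative only.
sample = {
    "patient_id": "p0",
    "visit_id": "v0",
    "report_tokens": tokens_per_visit,
    "label": 0,
}

print(sample["report_tokens"])
```

With this layout every distinct word becomes its own code, so the vocabulary grows with the size of the report corpus.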

bbb801 commented 7 months ago

Thank you. You have provided valuable advice.

bbb801 commented 7 months ago

Dear Dr. Yang,

Do you mean I should convert the text report into vectors, and then input the vectors as features into the PyHealth Transformer model? Or should I input the text directly into the PyHealth Transformer model?

Regards, Jundong

ycq091044 commented 7 months ago

Yes, you could "convert the text report into vectors, and then input the vectors as features into the PyHealth Transformer model".

Or you could tokenize the text into words, and our model will automatically learn the word embeddings (however, training will be slower because of the large word vocabulary).
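
One possible way to "convert the text report into vectors" is to average pretrained word vectors over the words in a report. The sketch below assumes this mean-pooling approach, which the thread does not prescribe. The `toy_vectors` table stands in for real pretrained embeddings such as word2vec, its values are made up, and `report_to_vector` is a hypothetical helper, not part of PyHealth.

```python
import re

# Toy 3-dimensional stand-in for pretrained word vectors (made-up values).
toy_vectors = {
    "no": [0.0, 3.0, 0.0],
    "pleural": [3.0, 0.0, 3.0],
    "effusion": [3.0, 3.0, 0.0],
}
DIM = 3


def report_to_vector(report: str) -> list[float]:
    """Average the vectors of all known words in the report (mean pooling)."""
    tokens = re.findall(r"[a-z0-9]+", report.lower())
    known = [toy_vectors[t] for t in tokens if t in toy_vectors]
    if not known:
        # No known word: fall back to a zero vector of the right size.
        return [0.0] * DIM
    # zip(*known) walks the vectors dimension by dimension.
    return [sum(column) / len(known) for column in zip(*known)]


# One fixed-length vector per report; unknown words are skipped.
vector = report_to_vector("No pleural effusion.")
print(vector)
```

Each report then becomes one fixed-length numeric vector, so the model does not need to learn an embedding for every word in the vocabulary.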

bbb801 commented 7 months ago

Thank you very much.