OHDSI / Aphrodite

[in development]
Apache License 2.0

FAQ's - APHRODITE functions? #9

Closed by SSMK-wq 4 years ago

SSMK-wq commented 4 years ago

Hello @jmbanda ,

I was reading the APHRODITE manual and came across a function called buildFeatureVector.

Could you please help me understand what the statement below means?

"Returns a patient feature vector (divided by feature sets)."

I understand each patient can have multiple features based on the number of drug concepts, visit concepts, lab concepts etc.

But what does "divided by feature sets" mean here?

Could you explain this with an example, please, so that I can clearly understand what APHRODITE does under the hood and learn to use it correctly?

jmbanda commented 4 years ago

The feature vector is basically all of the patient's features like you mention, and it is divided by feature types. In other words, what I meant to say here is that first it has all the conditions, then all the drugs, then labs, etc. (or whatever combination of those you select). In row representation you have:

PatientID | drug1 | drug2 | drug3 |....... |lab 1| lab 2| lab 3 | ....... |procedure 1| procedure 2| procedure 3|

It just orders them that way.

Hope this helps.
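The ordering described above can be illustrated with a minimal Python sketch. This is not APHRODITE's code (APHRODITE is an R package), and it does not reproduce the internals of buildFeatureVector. The patient ID, the counts, the domain order, and the `build_row` helper are all hypothetical, chosen only to mirror the row shown above.

```python
# Hypothetical per-domain counts for one patient (illustrative only,
# not real APHRODITE output). The dict is deliberately out of order.
patient_features = {
    "procedure": {"procedure 1": 1, "procedure 2": 0},
    "drug": {"drug1": 2, "drug2": 0, "drug3": 5},
    "lab": {"lab 1": 1, "lab 2": 3},
}

# Assumed domain order, matching the row representation in the thread.
DOMAIN_ORDER = ["drug", "lab", "procedure"]


def build_row(patient_id, features, domain_order=DOMAIN_ORDER):
    """Concatenate the feature blocks domain by domain into one flat row."""
    header = ["PatientID"]
    row = [patient_id]
    for domain in domain_order:
        block = features.get(domain, {})
        # Within a domain, sort by feature name so the layout is stable.
        for name in sorted(block):
            header.append(name)
            row.append(block[name])
    return header, row


header, row = build_row("P001", patient_features)
print(" | ".join(header))
print(" | ".join(str(value) for value in row))
```

All drug columns come first, then all lab columns, then all procedure columns. The values themselves are untouched; only the column layout is grouped by domain.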

SSMK-wq commented 4 years ago

Hi @jmbanda,

Yes, I understand the feature representation in row form, as shown below:

PatientID | drug1 | drug2 | drug3 |....... |lab 1| lab 2| lab 3 | ....... |procedure 1| procedure 2| procedure 3|

But what does dividing by "Feature Types" do to the above row? I have two questions. I did refer to the manual and see that this term is used frequently for several other functions as well, so it would be helpful to hear from you on the items below.

a) What are Feature Types? Are the different domains (for example Lab, Obs, Drug) called Feature Types? The manual uses the term "Feature Types", and I just wish to confirm my understanding. Can you help me understand this better?

b) What does dividing the row representation shown above by feature type do? For example, let's say a patient "Jack" has drugs A, B and C in his data. He has taken drug A 10 times, drug B 1 time and drug C 15 times. As the feature value for drug A, are we computing something like (drug A count) / (total number of records in this feature type, which is drug)?

Would it look like the row below?

Patient | Drug A | Drug B | Drug C
Jack | 10/26 | 1/26 | 15/26

jmbanda commented 4 years ago

The normalization and the separation of feature types are two completely different things, as I explained. "Division by feature types" was probably poor wording for what I already described: the domains are kept separate in the feature vector. This has no bearing on the normalization scheme selected.
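To make the distinction concrete, here is a small Python sketch using Jack's drug counts from the question. It is not APHRODITE code. The `flatten` helper, the "Lab X" feature, and the per-domain proportion scheme are assumptions for illustration; the thread does not say which normalization schemes APHRODITE offers. The point is only that the column layout is the same whichever transform is applied to the values.

```python
# Jack's drug counts come from the question; "Lab X" is a made-up extra
# feature so that a second domain appears in the row.
DOMAIN_ORDER = ["drug", "lab"]
jack = {
    "drug": {"Drug A": 10, "Drug B": 1, "Drug C": 15},
    "lab": {"Lab X": 4},
}


def flatten(features, transform):
    """Lay out features domain by domain, applying `transform` to each value."""
    names, values = [], []
    for domain in DOMAIN_ORDER:
        block = features.get(domain, {})
        domain_total = sum(block.values())
        for name in sorted(block):
            names.append(name)
            values.append(transform(block[name], domain_total))
    return names, values


def raw(count, domain_total):
    """No normalization: keep the raw count."""
    return count


def proportion(count, domain_total):
    """Hypothetical scheme from the question: count / total within the domain."""
    return count / domain_total if domain_total else 0.0


raw_names, raw_values = flatten(jack, raw)
norm_names, norm_values = flatten(jack, proportion)

print(raw_names, raw_values)
print(norm_names, norm_values)
```

Both calls return the same column names in the same drug-then-lab order. Swapping `raw` for `proportion` changes only the values, which is the sense in which the grouping by domain and the normalization scheme are independent.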