Minor Issues in final_true_dist and gpt4_demographics

Lassehhansen commented 1 year ago

Hi,

I am trying to reproduce some of your analysis and ran into a few minor issues.

1: "final_true_dist.csv" contains only 12 of the 20 diseases. Also, some of the numbers are given as fractions and others as percentages:

https://github.com/elehman16/gpt4_bias/blob/main/data_to_share/simulated_pt_distribution/true_dist_work/final_true_dist.csv

2: There are no row names in "gpt4_demographics.csv", so I am assuming that it follows the structure of Figure 1 in the preprint: https://www.medrxiv.org/content/10.1101/2023.07.13.23292577v1.full.pdf

Besides that, good work!

Best, Lasse

Lassehhansen commented 1 year ago

Also, for "Lupus" in the true distribution, the numbers add up to 1.010278 rather than exactly 1.
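
In case it helps while fixing, here is a minimal sketch of the kind of check that flags both the mixed fraction/percentage rows and rows that do not sum to 1. The column names (`disease`, `group_a`, ...) and the sample values are hypothetical and are not copied from "final_true_dist.csv"; only the 1.010278 total mirrors the Lupus row. The "sum above 1.5 means percentages" rule is a heuristic I am assuming, not something taken from the repo.

```python
import csv
import io

# Hypothetical sample: column names and numbers are made up for illustration,
# not copied from final_true_dist.csv.
SAMPLE = """disease,group_a,group_b,group_c
Disease A,0.50,0.30,0.20
Disease B,50,30,20
Disease C,0.52,0.30,0.190278
"""


def check_rows(handle, tol=1e-6):
    """Return {disease: (scale, normalized_total)} for rows that look off."""
    problems = {}
    for row in csv.DictReader(handle):
        name = row.pop("disease")
        values = [float(v) for v in row.values()]
        total = sum(values)
        # Heuristic (assumption): a row summing well above 1 is in percentages.
        scale = "percent" if total > 1.5 else "fraction"
        if scale == "percent":
            total /= 100
        # Flag percentage rows (inconsistent scale) and rows not summing to 1.
        if scale == "percent" or abs(total - 1.0) > tol:
            problems[name] = (scale, round(total, 6))
    return problems


report = check_rows(io.StringIO(SAMPLE))
for disease, (scale, total) in report.items():
    print(f"{disease}: scale={scale}, normalized sum={total}")
```

To run it on the real file, pass an open file handle instead of the `io.StringIO` sample and replace `"disease"` with whatever the first column is actually called.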

elehman16 commented 1 year ago

Whoops! Great finds. I will work on fixing and let you know when they are fixed!

Lassehhansen commented 8 months ago

Hello once more, I'm interested in utilizing the demographic ratios for the true prevalence in the USA and the GPT-4 estimates as depicted in Figure 1 of your paper:

https://www.thelancet.com/action/showPdf?pii=S2589-7500%2823%2900225-X

Could you guide me on where to locate these details?
