FullFact / health-misinfo-shared

Raphael health misinformation project, shared by Full Fact and Google
MIT License

Make package self-contained by moving prompt data into src folder #159

Closed by dcorney 1 month ago

dcorney commented 1 month ago

Overview

To install this repo as a library and call it from another project (e.g. Galactus), the package needs to be self-contained. Specifically, the file full_in_context_labelled_data.csv must live inside the src folder; otherwise it is not included when we pip install raphael, and loading it fails with the error: No such file or directory: 'data/full_in_context_labelled_data.csv'

Requirements

mv data/full_in_context_labelled_data.csv src/health_misinfo_shared/full_in_context_labelled_data.csv

Then in [fine_tuning.py / infer_claims()](https://github.com/FullFact/health-misinfo-shared/blob/45ff2080f6f19e24ef522c125326e6e2daf0ae37/src/health_misinfo_shared/fine_tuning.py#L453C5-L453C25) set `annotated_data_files = [Path(__file__).parent / "full_in_context_labelled_data.csv"]` so the path is resolved relative to the module itself and not the current working directory.

Make the equivalent change in the `__main__` block to keep the development script in sync, i.e. update the call to construct_in_context_examples() to use the new file location.
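As a rough illustration of the path change described above, here is a minimal sketch of resolving the CSV relative to the module file, so that it works wherever the package is installed or run from. The helper names `packaged_data_path` and `read_rows` and the constant `DATA_FILENAME` are hypothetical and not part of the repo; only the filename comes from this issue.

```python
import csv
from pathlib import Path

# Filename taken from the issue; the constant name itself is hypothetical.
DATA_FILENAME = "full_in_context_labelled_data.csv"


def packaged_data_path(module_file: str, filename: str = DATA_FILENAME) -> Path:
    """Return the path of `filename` sitting next to the module at `module_file`.

    Pass `__file__` from inside the package so the result does not depend
    on the current working directory.
    """
    return Path(module_file).resolve().parent / filename


def read_rows(path: Path) -> list[dict[str, str]]:
    """Load a CSV file as a list of row dictionaries keyed by column header."""
    with path.open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

Inside fine_tuning.py this would be used as `annotated_data_files = [packaged_data_path(__file__)]`, and the same call could feed construct_in_context_examples() in the `__main__` block. Note that moving the file only helps if the build configuration actually ships non-Python files with the package, which is worth checking after a fresh pip install.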