turbomam closed 3 weeks ago
I submitted src/ontology/aio-src.csv
and the following prompt to GPT 4
This is a ROBOT template for generating an OWL ontology. Change the values in the Description column to follow the Aristotelian genus/differentia style. The label of the parent should be used as the genus.
then
save the changes back to a CSV file?
that's terrible
let's start over. do not try to implement a solution with code. just use your large language model capabilities to combine the existing description and the label of the asserted parent class into a new description that flows well linguistically
remove all of the "...known as..." fragments
good. I want to download that as a csv file.
GPT is pretty confident of itself with that file name
would still need a lot of manual editing. will try Claude opus again now.
Repeated earlier GPT 4 upload/prompt ("This is a ROBOT template...") with Claude opus.
The Claude web interface doesn't want to provide a downloadable CSV and the output is too long to emit all at once. So I have asked for output containing only the ID and the new description.
May have to do it in batches or use llm
last full row:
AIO:Convolution2DTransposeLayer,A layer that implements transposed 2D convolution sometimes called deconvolution.
So then I just ask "The last row I saw was AIO:Convolution2DTransposeLayer. Please resume on the next row"
Some duplicates:
AIO:LazyBatchNorm1DLayer,A batch normalization layer with lazy initialization of the num_features argument that is inferred from the input size.
AIO:LazyBatchNorm2DLayer,A batch normalization layer with lazy initialization of the num_features argument that is inferred from the input size.
AIO:LazyBatchNorm3DLayer,A batch normalization layer with lazy initialization of the num_features argument that is inferred from the input size.
So I prompted:
I see some duplicates in there, like "AIO:LazyBatchNorm1DLayer,A batch normalization layer with lazy initialization of the num_features argument that is inferred from the input size.AIO:LazyBatchNorm2DLayer,A batch normalization layer with lazy initialization of the num_features argument that is inferred from the input size.AIO:LazyBatchNorm3DLayer,A batch normalization layer with lazy initialization of the num_features argument that is inferred from the input size.". Please do your best to ensure unique descriptions.
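A check like this could also be done offline rather than by eye. The following is a minimal sketch, assuming the model output is plain text with one `ID,description` pair per line and no quoting; the function name `find_duplicate_descriptions` is hypothetical and not part of any existing tooling in this repo.

```python
from collections import defaultdict


def find_duplicate_descriptions(text):
    """Return {description: [ids]} for descriptions shared by more than one ID.

    Assumes one "ID,description" pair per line. Only the first comma is
    treated as the separator, so commas inside a description are kept.
    """
    ids_by_description = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between batches
        term_id, _, description = line.partition(",")
        ids_by_description[description.strip()].append(term_id.strip())
    return {d: ids for d, ids in ids_by_description.items() if len(ids) > 1}


if __name__ == "__main__":
    sample = (
        "AIO:LazyBatchNorm1DLayer,A batch normalization layer with lazy initialization.\n"
        "AIO:LazyBatchNorm2DLayer,A batch normalization layer with lazy initialization.\n"
        "AIO:Convolution2DTransposeLayer,A layer that implements transposed 2D convolution.\n"
    )
    for description, ids in find_duplicate_descriptions(sample).items():
        print(ids, "share:", description)
```

This only catches exact duplicates; near-duplicates that differ by a word or two would still need a similarity measure or human review.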
Claude's generation of Aristotelian descriptions completed, and I added the output as a new column in the Google Sheet.
Then I loaded the sheet into GPT 4 and asked:
determine the lexical differences between the "Legacy Description" and "Claude opus Aristotelian definition" columns. I would use a Jaccard or k-mer approach, but I trust you to choose what's best. generate a tsv file with two columns: the ID and the difference.
I had to convince it that I wanted numerical difference scores, but it finally did emit them, and they look reasonable, as if a formula had been applied.
GPT 4 ultimately said that it was going to use (1 - (Jaccard similarity))
I added the column and set the sheet up with a filter view and made some of the columns narrower and wrapped.
It looks like some of the descriptions are really long and have content that should be extracted into additional comments.
Should double check the Jaccard scores with offline logic
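One way the offline check might look is sketched below. It computes 1 - (Jaccard similarity) over sets of lowercased word tokens. The tokenization is an assumption: GPT 4 did not say how it split the text, so scores from this sketch may not match its output exactly even if the formula is the same. The function names are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_distance(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B| over the word-token sets of a and b."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    union = tokens_a | tokens_b
    if not union:
        return 0.0  # two empty descriptions are treated as identical
    return 1.0 - len(tokens_a & tokens_b) / len(union)


if __name__ == "__main__":
    legacy = "A layer that implements transposed 2D convolution"
    revised = "A layer that implements transposed 2D convolution sometimes called deconvolution"
    print(round(jaccard_distance(legacy, revised), 3))
```

Running this over the "Legacy Description" and "Claude opus Aristotelian definition" columns and comparing against the GPT 4 scores would show whether a consistent formula was really applied.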
recommend closing
will require human review