berkeleybop / artificial-intelligence-ontology

An ontology modeling classes and relationships describing deep learning networks, their component layers and activation functions, machine learning methods, as well as AI/ML potential biases.
https://berkeleybop.github.io/artificial-intelligence-ontology/

genus/differentia aka Aristotelian definitions #75

Closed turbomam closed 3 weeks ago

turbomam commented 3 weeks ago

will require human review

turbomam commented 3 weeks ago

I submitted src/ontology/aio-src.csv and the following prompt to GPT 4

This is a ROBOT template for generating an OWL ontology. Change the values in the Description column to follow the Aristotelian genus/differentia style. The label of the parent should be used as the genus.

then

save the changes back to a CSV file?

turbomam commented 3 weeks ago

aio-updated.csv

turbomam commented 3 weeks ago

that's terrible

turbomam commented 3 weeks ago

let's start over. do not try to implement a solution with code. just use your large language model capabilities to combine the existing description and the label of the asserted parent class into a new description that flows well linguistically

turbomam commented 3 weeks ago

remove all of the "...known as..." fragments

turbomam commented 3 weeks ago

good. I want to download that as a csv file.

turbomam commented 3 weeks ago

aio-updated-final.csv

GPT is pretty confident of itself with that file name

turbomam commented 3 weeks ago

would still need a lot of manual editing. will try Claude opus again now.

turbomam commented 3 weeks ago

Repeated earlier GPT 4 upload/prompt ("This is a ROBOT template...") with Claude opus.

The Claude web interface doesn't want to provide a downloadable CSV and the output is too long to emit all at once. So I have asked for output containing only the ID and the new description.

turbomam commented 3 weeks ago

May have to do it in batches or use llm
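A minimal sketch of the batching idea, assuming the template rows have already been read into a list. The `batched_rows` helper and the batch size are hypothetical and not part of this repo; each batch would still need to be submitted to the model by hand or with a separate tool.

```python
from itertools import islice


def batched_rows(rows, batch_size):
    """Yield successive lists holding at most batch_size rows each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Hypothetical stand-in for rows read from the ROBOT template CSV.
    example_rows = [f"row {i}" for i in range(7)]
    for number, batch in enumerate(batched_rows(example_rows, 3), start=1):
        print(number, batch)
```

Smaller batches keep each response short enough to be emitted in one go, at the cost of more round trips.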

turbomam commented 3 weeks ago

last full row:

AIO:Convolution2DTransposeLayer,A layer that implements transposed 2D convolution sometimes called deconvolution.

So then I just ask "The last row I saw was AIO:Convolution2DTransposeLayer. Please resume on the next row"

turbomam commented 3 weeks ago

Some duplicates:

AIO:LazyBatchNorm1DLayer,A batch normalization layer with lazy initialization of the num_features argument that is inferred from the input size.
AIO:LazyBatchNorm2DLayer,A batch normalization layer with lazy initialization of the num_features argument that is inferred from the input size.
AIO:LazyBatchNorm3DLayer,A batch normalization layer with lazy initialization of the num_features argument that is inferred from the input size.
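Duplicates like these could also be found mechanically rather than by eye. The sketch below assumes the model output is one unquoted `ID,description` pair per line, as in the rows above; `find_duplicate_descriptions` is a hypothetical helper, not something in this repo.

```python
from collections import defaultdict


def find_duplicate_descriptions(lines):
    """Map each description shared by more than one ID to the list of those IDs."""
    ids_by_description = defaultdict(list)
    for line in lines:
        # Split on the first comma only, since descriptions may contain commas.
        term_id, separator, description = line.partition(",")
        if not separator:
            continue  # skip blank or malformed lines
        # Normalize case and whitespace so trivial variations still match.
        key = " ".join(description.lower().split())
        ids_by_description[key].append(term_id.strip())
    return {text: ids for text, ids in ids_by_description.items() if len(ids) > 1}


if __name__ == "__main__":
    shared = ("A batch normalization layer with lazy initialization of the "
              "num_features argument that is inferred from the input size.")
    rows = [
        "AIO:LazyBatchNorm1DLayer," + shared,
        "AIO:LazyBatchNorm2DLayer," + shared,
        "AIO:LazyBatchNorm3DLayer," + shared,
    ]
    for text, ids in find_duplicate_descriptions(rows).items():
        print(ids, "->", text)
```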

turbomam commented 3 weeks ago

So I prompted:

I see some duplicates in there, like "AIO:LazyBatchNorm1DLayer,A batch normalization layer with lazy initialization of the num_features argument that is inferred from the input size.AIO:LazyBatchNorm2DLayer,A batch normalization layer with lazy initialization of the num_features argument that is inferred from the input size.AIO:LazyBatchNorm3DLayer,A batch normalization layer with lazy initialization of the num_features argument that is inferred from the input size.". Please do your best to ensure unique descriptions.

turbomam commented 3 weeks ago

Claude finished generating the Aristotelian descriptions, and I added them as a new column in the Google Sheet.

turbomam commented 3 weeks ago

Now I loaded the sheet into GPT 4 and asked

determine the lexical differences between the "Legacy Description" and "Claude opus Aristotelian definition" columns. I would use a Jaccard or k-mer approach, but I trust you to choose what's best. generate a tsv file with two columns: the ID and the difference.

turbomam commented 3 weeks ago

I had to convince it that I wanted a numerical difference score, but it finally did emit them and they look reasonable, as if a formula was applied.

GPT 4 ultimately said that it was going to use 1 - (Jaccard similarity), i.e. the Jaccard distance.

I added the column and set the sheet up with a filter view and made some of the columns narrower and wrapped.

It looks like some of the descriptions were really long and contain content that should be extracted into additional comments.

turbomam commented 3 weeks ago

Should double check the Jaccard scores with offline logic
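One possible offline check, as a sketch only: it computes 1 - (Jaccard similarity) over sets of lowercase word tokens. How GPT 4 actually tokenized the text is not known, so the word-level `tokens` function is an assumption, and the scores will differ if it compared character k-mers instead.

```python
import re


def tokens(text):
    """Return the set of lowercase word tokens in text (assumed tokenization)."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def jaccard_distance(text_a, text_b):
    """Return 1 - |A & B| / |A | B| for the token sets of the two texts."""
    set_a, set_b = tokens(text_a), tokens(text_b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty descriptions are treated as identical
    return 1.0 - len(set_a & set_b) / len(union)


if __name__ == "__main__":
    legacy = "Transposed 2D convolution sometimes called deconvolution."
    revised = ("A layer that implements transposed 2D convolution "
               "sometimes called deconvolution.")
    print(round(jaccard_distance(legacy, revised), 3))
```

Running this over the "Legacy Description" and "Claude opus Aristotelian definition" columns and comparing against the GPT-generated column would show whether a consistent formula was really applied.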

turbomam commented 3 weeks ago

recommend closing