commonsense / conceptnet5

Code for building ConceptNet from raw data.

https://www.conceptnet.io/c/en/talbe #332

Open garyphilip opened 1 year ago

garyphilip commented 1 year ago

Are apparent typos like this ever recognized and pruned?

Just the fact that talbe has only one connection to another node in the data would seem to suggest something is wrong.

In particular, it has an "is made of" relationship to {tree} which has many interconnections. Can anything be made of something ubiquitous and yet have no other connections? I suppose that is possible as new data gets added, but over time it seems it should become suspect.

Maybe some other algorithm?
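One way to make the heuristic above concrete is a minimal sketch that flags any node whose only neighbor is a highly connected node. It assumes edges are available as `(start, relation, end)` triples; the sample edges, the function name `suspect_nodes`, and the `hub_degree` threshold are all hypothetical and not taken from ConceptNet's code or data.

```python
from collections import defaultdict


def suspect_nodes(edges, hub_degree=3):
    """Return nodes with exactly one neighbor, where that neighbor is a hub.

    `hub_degree` is an illustrative threshold, not a ConceptNet setting.
    """
    # Build an undirected neighbor map from the (start, relation, end) triples.
    neighbors = defaultdict(set)
    for start, _rel, end in edges:
        neighbors[start].add(end)
        neighbors[end].add(start)

    suspects = []
    for node, adjacent in neighbors.items():
        if len(adjacent) == 1:
            (only,) = adjacent
            # A lone link into a well-connected node is treated as suspicious.
            if len(neighbors[only]) >= hub_degree:
                suspects.append(node)
    return sorted(suspects)


# Hypothetical sample graph for illustration only.
edges = [
    ("/c/en/talbe", "/r/MadeOf", "/c/en/tree"),
    ("/c/en/table", "/r/MadeOf", "/c/en/wood"),
    ("/c/en/wood", "/r/PartOf", "/c/en/tree"),
    ("/c/en/tree", "/r/IsA", "/c/en/plant"),
    ("/c/en/plant", "/r/IsA", "/c/en/organism"),
    ("/c/en/table", "/r/IsA", "/c/en/furniture"),
]

print(suspect_nodes(edges))
```

A check like this would also flag legitimate but rare terms, so it is probably better suited to producing a review list than to pruning automatically.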

havasi commented 1 year ago

Hi Gary,

Sorry for taking a bit to get back to you! Typos tend to be handled by having a low weight: most folks and algorithms filter out low-weight links. By default we don't include them in, say, Numberbatch.
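As a rough illustration of what filtering out low-weight links could look like on the consumer side, here is a minimal sketch. It assumes each edge is a dict with a numeric `weight` field; the sample weights, the function name `filter_by_weight`, and the `min_weight` cutoff are hypothetical and are not the values used by ConceptNet or Numberbatch.

```python
def filter_by_weight(edges, min_weight=1.0):
    """Keep only edges whose weight is at least `min_weight`.

    The default cutoff is an illustrative assumption.
    """
    return [edge for edge in edges if edge["weight"] >= min_weight]


# Hypothetical edges and weights for illustration only.
sample_edges = [
    {"start": "/c/en/talbe", "rel": "/r/MadeOf", "end": "/c/en/tree", "weight": 0.5},
    {"start": "/c/en/table", "rel": "/r/MadeOf", "end": "/c/en/wood", "weight": 2.0},
]

kept = filter_by_weight(sample_edges)
for edge in kept:
    print(edge["start"], edge["rel"], edge["end"], edge["weight"])
```

With these made-up weights, the `talbe` edge falls below the cutoff and is dropped, while the `table` edge is kept.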


garyphilip commented 1 year ago

Thank you, Catherine, for your helpful and informative reply.

Gary
