nleguillarme / taxonerd

TaxoNERD : recognizing taxonomic entities using deep models
MIT License

Issue with "inhomogeneous shape" during Entity Linking in v1.5.0 #12

Closed (maxfarrell closed 1 year ago)

maxfarrell commented 1 year ago

I previously used taxonerd v1.3.3 via R in a two-step process: first identifying entities with a model initialized without entity linking, then linking these entities to NCBI with the same model initialized with a linker.

Now, after updating to v1.5.0 and the new weak_md model, I get the following error when calling find.in.text with a model initialized with a linker:

Error in py_call_impl(callable, dots$args, dots$keywords) :
  ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

I get this error initializing the model as in the vignette:

init.taxonerd(model = "en_core_eco_md", exclude=list("tagger", "attribute_ruler", "lemmatizer", "parser"), linker="ncbi_taxonomy", thresh=0.85, gpu=FALSE)

I also get the same error trying with ncbi_taxonomy as linker, and removing the "exclude" argument.

nleguillarme commented 1 year ago

Hi @maxfarrell, I managed to reproduce the error. I will investigate it as soon as possible.

nleguillarme commented 1 year ago

It should be fixed in v1.5.1.

maxfarrell commented 1 year ago

I installed v1.5.1 and re-installed the model en_core_eco_weak_md:

install.packages("https://github.com/nleguillarme/taxonerd/releases/download/v1.5.1/taxonerd_for_R_1.5.1.tar.gz", repos=NULL)
library(taxonerd); packageVersion("taxonerd")  # 1.5.1
install.model(model="en_core_eco_weak_md", version="1.0.0")

But I am still getting the "inhomogeneous shape" error when attempting to do entity linking via NCBI.

nleguillarme commented 1 year ago

Hmmm, that's weird... Could you please try the following: remove the r-taxonerd virtual environment (in ~/.virtualenvs/ on Linux), then do a fresh install using install.taxonerd().
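
For anyone who prefers to script the cleanup step, here is a minimal sketch using only the Python standard library. The helper name `remove_virtualenv` is hypothetical and not part of taxonerd. The sketch assumes the environment is an ordinary directory under the virtualenv root, and the demo runs against a temporary directory so nothing real is deleted.

```python
import shutil
import tempfile
from pathlib import Path


def remove_virtualenv(name: str, root: Path) -> bool:
    """Delete the virtual environment directory root/name.

    Hypothetical helper for illustration only.
    Returns True if the directory existed and was removed, False otherwise.
    """
    env_dir = root / name
    if not env_dir.is_dir():
        return False
    shutil.rmtree(env_dir)
    return True


# Dry run against a throwaway directory, so no real environment is touched.
with tempfile.TemporaryDirectory() as tmp:
    fake_root = Path(tmp)
    # Mimic the layout of an existing environment.
    (fake_root / "r-taxonerd" / "bin").mkdir(parents=True)
    removed = remove_virtualenv("r-taxonerd", fake_root)
    still_there = (fake_root / "r-taxonerd").exists()
    # A second call finds nothing left to remove.
    removed_again = remove_virtualenv("r-taxonerd", fake_root)
```

For the real cleanup on Linux, the call would be `remove_virtualenv("r-taxonerd", Path.home() / ".virtualenvs")`, followed by a fresh `install.taxonerd()` from R and re-installing the model.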

maxfarrell commented 1 year ago

Works now - thanks!