massimoaria / bibliometrix

An R-tool for comprehensive science mapping analysis. A package for quantitative research in scientometrics and bibliometrics.
https://www.bibliometrix.org

Error in grep(x, M$CR[M$PY >= Year]) within localCitations() #34

Closed by mtorressahli 5 years ago

mtorressahli commented 5 years ago

Hello

First of all, many thanks for this package. It is awesome.

Using a Scopus database, I've been getting this error when using either histNetwork or localCitations:

Articles analysed 100
Articles analysed 200
Articles analysed 300
Error in grep(x, M$CR[M$PY >= Year]) : invalid regular expression, reason 'Invalid character range'

I think it may be due to the grep call inside histNetwork not playing well with a particular pattern in title fields. Some non-English articles have the title in a second language between "[ ]", as in "PREJUDICE AGAINST FAT PEOPLE [PREJUICIO CONTRA PERSONAS OBESAS]". The brackets are interpreted as regular-expression special characters rather than as literal text.
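A minimal sketch of that behaviour, using the title quoted above. The reference strings (author names, years, journals) are made up for illustration, and this is not the package's actual code, only plain base R `grep`:

```r
# Title taken from the comment above; the reference strings are hypothetical.
title <- "PREJUDICE AGAINST FAT PEOPLE [PREJUICIO CONTRA PERSONAS OBESAS]"
refs <- c(
  "DOE J., PREJUDICE AGAINST FAT PEOPLE [PREJUICIO CONTRA PERSONAS OBESAS] (2010) SOME JOURNAL",
  "ROE A., AN UNRELATED TITLE (2012) ANOTHER JOURNAL"
)

# Default regex matching: "[...]" is read as a set of characters,
# so the literal "[" in the reference is never matched.
regex_hits <- grep(title, refs)

# fixed = TRUE compares the title as a literal string instead.
literal_hits <- grep(title, refs, fixed = TRUE)

print(regex_hits)
print(literal_hits)
```

With this particular title the regex version does not raise an error; it simply fails to find the reference. An error such as 'Invalid character range' would only be expected when the bracketed text happens to contain a hyphen that forms a backwards range.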

I worked around it by replacing all "[...]" with "(...)" in titles within the output of readFiles before using convert2df, but it surely deserves to be reported in case it can be avoided in the package itself.

I'm not sure whether the better solution is the one I used or escaping those brackets before pos = grep(x, M$CR[M$PY >= Year]) within histNetwork, but it may be useful to implement either of them.
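The escaping alternative could look roughly like the sketch below. `escape_regex` is a hypothetical helper, not a bibliometrix function, and the reference strings are again made up; the thread does not say which approach the package eventually adopted.

```r
# Hypothetical helper (not part of bibliometrix): put a backslash in front of
# regex metacharacters so the string is matched literally.
escape_regex <- function(x) {
  gsub("([.|()\\^{}+$*?]|\\[|\\])", "\\\\\\1", x)
}

title <- "PREJUDICE AGAINST FAT PEOPLE [PREJUICIO CONTRA PERSONAS OBESAS]"
refs <- c(
  "DOE J., PREJUDICE AGAINST FAT PEOPLE [PREJUICIO CONTRA PERSONAS OBESAS] (2010) SOME JOURNAL",
  "ROE A., AN UNRELATED TITLE (2012) ANOTHER JOURNAL"
)

# Show the escaped pattern, then use it as an ordinary regular expression.
escaped <- escape_regex(title)
print(escaped)

pos <- grep(escaped, refs)
print(pos)
```

Escaping keeps the titles unchanged in the data frame, whereas replacing the brackets alters the stored text; which trade-off is preferable depends on how the titles are used elsewhere.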

Best, Manuel

Off-topic: I created a couple of helper functions that let me homogenize titles within the Scopus database more easily. I'm not sure where I could share them with others in case they are helpful.

massimoaria commented 5 years ago

I followed your advice. I think now the issue is solved.

Regarding: "Off-topic: I created a couple of helper functions that let me homogenize titles within the Scopus database more easily. I'm not sure where I could share them with others in case they are helpful." Please send me your functions. I could evaluate the possibility of integrating them into bibliometrix.

silviaegt commented 4 years ago

Hey @massimoaria, I second the congratulations for this amazing package! I got this same error today (I just downloaded this package and have been playing around):

Error in grep(y, M$CR[M$PY >= Year]) : 
  invalid regular expression '\(2014\) HOW ACCURATE ARE WIKIPEDIA ARTICLES IN HEALTH, NUTRITION, AND MEDICINE? [LES ARTICLES DE WIKIPDIA DANS LES DOMAINES DE LA SANT, DE LA NUTRITION ET DE LA MDECINE SONT-ILS EXACTS?]', reason 'Invalid character range'

I read @mtorressahli's post and replaced all square brackets in the titles with the following call:

df$TI <- chartr("[]", "()", df$TI)

That solved my problem as well, and I could run the following without problems:

citados_local <- localCitations(df, sep = ";")
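A small sketch of why this title triggers the error and what `chartr` changes. The title below is a shortened, hypothetical variant of the one in the error message, and "ANY TEXT" is a placeholder string; none of this is the package's own code.

```r
# Shortened, hypothetical variant of the title in the error message above.
ti <- "NUTRITION, AND MEDICINE? [SONT-ILS EXACTS?]"

# "[...]" is parsed as a bracket expression, and "T-I" is a range that runs
# backwards, which the regex engine rejects.
msg <- tryCatch(
  suppressWarnings(grep(ti, "ANY TEXT")),
  error = function(e) conditionMessage(e)
)
print(msg)

# chartr() maps "[" to "(" and "]" to ")" character by character.
ti_fixed <- chartr("[]", "()", ti)
print(ti_fixed)

# The pattern now compiles: parentheses form a group, not a character range.
hits <- grep(ti_fixed, "ANY TEXT")
print(length(hits))
```

The escaped `\(2014\)` in the error message suggests that parentheses are already escaped before the `grep` call, which would explain why swapping brackets for parentheses is enough here; that reading is an inference from the message, not something confirmed in the thread.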

Hope this is useful :)

massimoaria commented 4 years ago

Thanks