MaxKuehn / zotero-scholar-citations

Zotero plugin for auto-fetching numbers of citations from Google Scholar

Suggestion #5

Open MaxwellGuo opened 5 years ago

MaxwellGuo commented 5 years ago

see more in https://github.com/beloglazov/zotero-scholar-citations/issues/37#issuecomment-531533899

MaxKuehn commented 5 years ago

"Extra"-field usage:

It's on the Roadmap, but it needs to be supported by Zotero first. According to this discussion it's planned for Zotero 5.2.

The Extra field is the only field that doesn't have a particular purpose. All other fields were added to Zotero for a reason, and hijacking them doesn't seem like a good idea to me.

I'd love to have an actual custom field that's an actual number and not a string, so I can throw away all the string-parsing shenanigans.
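To illustrate the kind of string parsing a text-only Extra field forces, here is a minimal sketch. The `CITES:` tag and both helper names are hypothetical and are not taken from the plugin's source; the plugin's real prefix and number format may differ.

```javascript
// Hypothetical tag; the real plugin's prefix and padding may differ.
const TAG = 'CITES';

// Return the citation count stored in an Extra string, or null if none is found.
function readCount(extra) {
  const match = new RegExp(`^${TAG}:\\s*(\\d+)\\s*$`, 'm').exec(extra || '');
  return match ? parseInt(match[1], 10) : null;
}

// Drop any existing tagged line, then put the new count on the first line.
// All other lines of the Extra field are kept in their original order.
function writeCount(extra, count) {
  const kept = (extra || '').split('\n').filter((l) => !l.startsWith(`${TAG}:`));
  const rest = kept.join('\n');
  return rest === '' ? `${TAG}: ${count}` : `${TAG}: ${count}\n${rest}`;
}
```

With a real numeric field, both helpers would collapse into a plain read and write of a number.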

Right now the plugin does a best effort of what's discussed at the end in the link you included in your comment:

DOI

Getting the number of citations through the DOI, instead of the title, would be better. It would be more accurate.

The problem with DOIs is that they have to be supported by the publisher, so relying exclusively on DOIs is problematic. I'm thinking about something like: if there's a DOI, use that; otherwise use title + authors.
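A rough sketch of that fallback rule, assuming a simplified item shape (`{ doi, title, authors }`) that is hypothetical and not Zotero's item API. It only decides which key to search with; whether a given search engine accepts a DOI as a query is a separate question.

```javascript
// Pick a lookup key for an item: DOI when present, otherwise title plus authors.
// The item shape ({ doi, title, authors }) is hypothetical, not Zotero's item API.
function lookupKey(item) {
  const doi = (item.doi || '').trim();
  if (doi !== '') {
    return { kind: 'doi', query: doi };
  }
  const authors = (item.authors || []).join(' ');
  return { kind: 'title-authors', query: `${item.title} ${authors}`.trim() };
}
```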

Also, what happens if publishers start including DOIs in the reference lists for the articles? That might push the "wrong" paper up in the search results just because it's more popular.

Other Search Engines

Add other search engines, for example, Bing Academic. See also https://forums.zotero.org/discussion/77638/citation-count/p1

If I ever do that, it's gonna be another plugin or a plugin which integrates all sorts of citation/reference utility. As long as it's the zotero-scholar-citations it's gonna be GS only. Mostly because fiddling with search results and not having a proper API is a fucking pain.

MaxwellGuo commented 5 years ago

"Extra"-field usage:

At the least, you could add an option that lets users choose which field the citation count is written to.

DOI

You can use the following site, or similar sites, for an indirect search: https://academicguides.waldenu.edu/library/exactarticle I use other search engines instead of Google Scholar. I don't know whether Google Scholar has a command to search for an article by DOI.

Other Search Engines

I understand. Google Scholar is good enough, but for the following reasons it is not the best choice for some users:

MaxKuehn commented 5 years ago

At the least, you could add an option that lets users choose which field the citation count is written to.

I could, but that messes with the content of fields that are potentially used by Zotero and/or other plugins.

I don't know whether Google Scholar has a command to search for an article by DOI.

Unfortunately, it doesn't, which is why I have to stick to the parameters that Google supports ...

For Chinese users, we must use proxy to do google search, alas,

You probably mean a web-proxy, right? An actual proxy shouldn't make a difference. The web-proxy problem might actually be solvable by making the base URL configurable. The only problem: the UI! Right now the plugin doesn't have any config UI, and while adding one configuration parameter would be pretty easy, adding an entire config dialog plus that one parameter takes a lot of time and effort.
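As a sketch of what a configurable base URL could mean in code: the `config.baseUrl` setting and the `scholarEndpoint` helper below are hypothetical (the plugin has no such option at this point), and the proxy host is a placeholder.

```javascript
const DEFAULT_BASE_URL = 'https://scholar.google.com/';

// Resolve the search endpoint against a user-supplied base URL.
// `config.baseUrl` is a hypothetical setting; the plugin has no such option today.
function scholarEndpoint(config) {
  const base = (config && config.baseUrl) || DEFAULT_BASE_URL;
  // Ensure a trailing slash so a path prefix in the base URL is preserved.
  const normalized = base.endsWith('/') ? base : `${base}/`;
  return new URL('scholar', normalized).toString();
}
```

For example, `scholarEndpoint({ baseUrl: 'https://proxy.example.org/gs' })` resolves to a `/gs/scholar` path on the proxy host, while an empty config falls back to the default.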

The Google robot is troublesome,

Can you elaborate on that?

Bing Academic is better than Google Scholar in some respects.

I actually spent a lot of time thinking about this and similar things when I went through all the issues. As soon as you add another search engine you get a whole bunch of problems. Bing says the reference count is 5, Google Scholar says it's 6. Who's right? In a way both are, because it's just the number of references each of them has indexed. So, do I use the arithmetic mean of both numbers? Or is one more important than the other? How about letting the user decide by adding some weights in a config dialog to express which number is more "trusted"?

But then again, the number of citations isn't that great of a metric to begin with, and we should probably use something like the h-index instead. But none of the pages supports it in a usable way, so ... back to the roots: the name is zotero-scholar-citations. What does it do? Get the Google Scholar citation count! Nothing more, nothing less.
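For reference, the weighting idea floated above amounts to a weighted mean. This is only a sketch of that arithmetic; the engine names and the `weightedCount` helper are illustrative, and nothing like it exists in the plugin.

```javascript
// Combine per-engine counts into one number using user-chosen trust weights.
// Engines missing from `weights` default to a weight of 1.
function weightedCount(counts, weights = {}) {
  let total = 0;
  let weightSum = 0;
  for (const [engine, count] of Object.entries(counts)) {
    const w = weights[engine] ?? 1;
    total += w * count;
    weightSum += w;
  }
  // With no counts, or all weights zero, there is nothing meaningful to report.
  return weightSum === 0 ? null : total / weightSum;
}
```

With counts of 5 and 6 and equal weights this is the plain arithmetic mean; trusting the second engine three times as much shifts the result toward 6. Either way the output is a blend that neither service actually reported, which is part of the objection.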

MaxwellGuo commented 5 years ago

Can you elaborate on that?

When we fetch data from Google Scholar too many times, the Google robot blocks us from fetching more data. We must wait a long time before fetching data again. If there were other search engines, we could switch to another engine when we are blocked.
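The switching behaviour described here could be sketched as a small cooldown tracker. The engine names, the cooldown length, and the `EngineRotation` class are all illustrative assumptions; detecting that an engine has actually blocked a request is outside the sketch.

```javascript
// Track which engines are temporarily blocked and pick the first usable one.
// Engine names and the cooldown length are illustrative assumptions.
class EngineRotation {
  constructor(engines, cooldownMs) {
    this.engines = engines;
    this.cooldownMs = cooldownMs;
    this.blockedUntil = new Map();
  }

  // Record that an engine refused a request at time `now` (milliseconds).
  markBlocked(engine, now) {
    this.blockedUntil.set(engine, now + this.cooldownMs);
  }

  // Return the first engine whose cooldown has expired, or null if all are blocked.
  pick(now) {
    const usable = this.engines.find((e) => (this.blockedUntil.get(e) || 0) <= now);
    return usable === undefined ? null : usable;
  }
}
```

Engines are tried in the order given, so the preferred one is used again as soon as its cooldown ends.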

MaxwellGuo commented 5 years ago

The only problem: the UI!

It probably doesn't need a UI. Just something users can modify, e.g. a text file (you would only need to explain the usage in the GitHub README). Is it possible to add some parameters to a "config" that can be modified?

MaxKuehn commented 5 years ago

When we fetch data from Google Scholar too many times, the Google robot blocks us from fetching more data. We must wait a long time before fetching data again. If there were other search engines, we could switch to another engine when we are blocked.

Ah, that's what you meant.

It probably doesn't need a UI. Just something users can modify, e.g. a text file (you would only need to explain the usage in the GitHub README). Is it possible to add some parameters to a "config" that can be modified?

Well, if it's not via UI and editing files is fine I might as well ask people to change the code directly. Right now you can

  1. download the .xpi file
  2. rename it to a .zip file
  3. unpack it
  4. change line 9 in chrome/content/zsc.js to whatever web-proxied scholar link you want to use
  5. zip it again
  6. rename it back to .xpi
  7. install it

or

  1. check out the source code
  2. change line 9 in zsc.js
  3. do a
    npm install
    npm run package
  4. install the .xpi file from the build folder

where the unzip/zip and edit stuff is about as difficult as tracking down a config file that's somewhere in an OS and Zotero dependent directory.

But I just don't expect your average non-programmer to have the level of technical adeptness to do either one of those, so it's probably gonna require UI stuff.

MaxwellGuo commented 5 years ago

I have tried making some modifications to your code, but I don't know how to clear the 'extra' contents before fetching data, or how to fetch data through the 'doi'.

MaxwellGuo commented 5 years ago

what happens if publishers start including DOIs in the reference lists for the articles? That might push the "wrong" paper up in the search results just because it's more popular

In fact, I always get the right results through the DOI, so I doubt this is a real problem. I have tried modifying your code, but I am not a programmer, so I cannot solve it easily.

MaxKuehn commented 5 years ago

In fact, I always get the right results through the DOI, so I doubt this is a real problem. I have tried modifying your code, but I am not a programmer, so I cannot solve it easily.

What I meant to illustrate with that example is that a DOI in a search result is a text blob that happens to be on an indexed page. Whether or not that page actually is the "correct" result depends on how the indexed pages handle DOIs and is subject to arbitrary changes in the future. On top of that, if I take my collection as an example: almost half of the items do not even have DOIs.

The criteria the plugin is currently using on the other hand are properties that you can use when you execute an advanced search with google scholar. Every piece of literature has a title, author(s) and a date of publishing. In combination they, for the most part, uniquely identify a publication.
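A sketch of what such an advanced-search request could look like. The parameter names (`as_q`, `as_occt`, `as_sauthors`, `as_ylo`, `as_yhi`, `num`) are assumed to mirror Google Scholar's advanced-search form and should be verified against the live form; `advancedSearchUrl` is a hypothetical helper, not the plugin's function.

```javascript
// Build a Google Scholar advanced-search URL from title, author, and year.
// Parameter names are an assumption based on the advanced-search form; verify before use.
function advancedSearchUrl({ title, author, year }) {
  const params = new URLSearchParams({
    as_q: title,          // words to search for
    as_occt: 'title',     // restrict the match to the title
    as_sauthors: author,  // author filter
    as_ylo: String(year), // earliest publication year
    as_yhi: String(year), // latest publication year
    num: '1',             // only the top result is needed
  });
  return `https://scholar.google.com/scholar?${params.toString()}`;
}
```

`URLSearchParams` takes care of escaping, so titles with spaces or punctuation can be passed in unmodified.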

So if I have to prioritize between different things to implement next, why would I go with something that might or might not, temporarily, work for about half of a collection and, for the most part, not make any difference for the actual result?

MaxwellGuo commented 5 years ago

When we use the DOI, the first result is always the one we want to find. Is it possible to take advantage of this to make the plugin work better?

maxsu commented 4 years ago

Regarding other engines, like Microsoft:

  1. APIs - Microsoft's Academic Knowledge API could be a real advantage.
  2. Disagreement between services - Unify results from different engines
    • Improves accuracy of a paper's citation count (finds missed citations)
    • Biggest challenge: A forward version of the citation matching problem (Compute the overlap of 2 forward citation lists)
    • DOI goes a long way here
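To make the overlap idea concrete, here is a minimal sketch that merges two forward-citation lists keyed by DOI. The entry shape (`{ doi, title }`) and both helper names are hypothetical. Entries without a DOI are only counted, not matched, which is where the harder citation-matching problem begins.

```javascript
// Normalize a DOI so the same work matches across services.
function normalizeDoi(doi) {
  return doi.trim().toLowerCase().replace(/^https?:\/\/(dx\.)?doi\.org\//, '');
}

// Merge two forward-citation lists and report their overlap, keyed by DOI.
// Entries without a DOI cannot be matched this way and are counted separately.
function mergeCitations(listA, listB) {
  const a = new Set(listA.filter((c) => c.doi).map((c) => normalizeDoi(c.doi)));
  const b = new Set(listB.filter((c) => c.doi).map((c) => normalizeDoi(c.doi)));
  const overlap = [...a].filter((d) => b.has(d));
  const union = new Set([...a, ...b]);
  const unmatched = listA.concat(listB).filter((c) => !c.doi).length;
  return { unionCount: union.size, overlapCount: overlap.length, unmatched };
}
```

The union size is the unified count for DOI-bearing citations; the `unmatched` entries would still need title and author matching before they could be added without double counting.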

As for renaming, I agree we'd need to reconceptualize the addon and start a new project, like "Zotero-Forward-Citations" or "FCite".