greenelab / lab-website-template

An easy-to-use, flexible website template for labs.
https://greenelab.github.io/lab-website-template/
BSD 3-Clause "New" or "Revised" License

Pubmed citations do not generate properly #210

Status: Closed (closed by chrisamiller 1 year ago)

chrisamiller commented 1 year ago

Link to your website repo

No response

Version of Lab Website Template you are using

1.1.5

Description

Pubmed citations don't appear to be working in the most recent release of lab-website-template. Upon loading a search into _data/pubmed.yaml such as this:

- term: "Ley, TJ[Author]"

Manubot seems to choke on it, and while it creates entries, they are missing essentially all key information:

[no author info]
[no publisher info]   ·   [no date info]   ·   pubmed:37339484 

citations.yaml just looks like this:

- id: pubmed:37339484
  term: Ley, TJ[Author]
  plugin: pubmed.py
  file: pubmed.yaml
- id: pubmed:37160317
  term: Ley, TJ[Author]
  plugin: pubmed.py
  file: pubmed.yaml
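
A quick way to see how many entries are affected is to scan the parsed citations for missing metadata. This is a minimal sketch: the field names `authors`, `publisher`, and `date` are assumptions inferred from the placeholders on the rendered page, and `find_incomplete` is a hypothetical helper, not part of the template.

```python
# Field names are assumptions inferred from the rendered placeholders
# ("[no author info]", "[no publisher info]", "[no date info]").
REQUIRED_FIELDS = ("authors", "publisher", "date")


def find_incomplete(entries, required=REQUIRED_FIELDS):
    """Return the ids of entries that lack any of the required fields."""
    incomplete = []
    for entry in entries:
        missing = [field for field in required if not entry.get(field)]
        if missing:
            incomplete.append(entry.get("id"))
    return incomplete


if __name__ == "__main__":
    # The two entries quoted above, as they would look after YAML parsing.
    entries = [
        {"id": "pubmed:37339484", "term": "Ley, TJ[Author]",
         "plugin": "pubmed.py", "file": "pubmed.yaml"},
        {"id": "pubmed:37160317", "term": "Ley, TJ[Author]",
         "plugin": "pubmed.py", "file": "pubmed.yaml"},
    ]
    print(find_incomplete(entries))
```

Both quoted entries carry only the bookkeeping keys (`id`, `term`, `plugin`, `file`), so both would be reported as incomplete.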

And the process logs show the following errors:

Compiling sources
Running google-scholar plugin
    Found 0 google-scholar* data file(s)
Running pubmed plugin
    Found 1 pubmed* data file(s)
    Processing data file pubmed.yaml
        Processing entry 1 of 1, term: Ley, TJ[Author] (from cache)
            id: pubmed:37339484
            id: pubmed:37160317
            id: pubmed:37079859
. . .

Running orcid plugin
    Found 0 orcid* data file(s)
Running sources plugin
    Found 0 sources* data file(s)
Merging sources by id
259 total source(s) to cite

--------------------

Generating citations
Processing source 1 of 259, id: pubmed:37339484
    Using Manubot to generate citation## WARNING
Generating csl_item for 'pubmed:37339484' failed due to a ImportError:
cannot import name 'RequestRate' from 'pyrate_limiter' (/usr/local/lib/python3.9/dist-packages/pyrate_limiter/__init__.py)

            Couldn't parse Manubot response

Processing source 2 of 259, id: pubmed:37160317
    Using Manubot to generate citation       Jekyll Feed: Generating feed for posts
## WARNING
Generating csl_item for 'pubmed:37160317' failed due to a ImportError:
cannot import name 'RequestRate' from 'pyrate_limiter' (/usr/local/lib/python3.9/dist-packages/pyrate_limiter/__init__.py)

            Couldn't parse Manubot response

And so on.

Some research shows that there's an underlying issue with a package manubot depends on: https://github.com/manubot/manubot/issues/367

They appear to have since fixed it by requiring specific versions of the package in question. Is this fix something that can be pushed into the lab-website-template repo so that pubmed functionality works again?

Thanks!
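
The ImportError in the logs can be checked in isolation, without running the whole citation process. The sketch below asks whether an installed module exposes a given name; `module_exports` is a hypothetical helper, and using it on `pyrate_limiter` and `RequestRate` simply mirrors the import that fails in the log.

```python
import importlib


def module_exports(module_name, attribute):
    """Return True or False depending on whether the module exposes the
    attribute, or None if the module cannot be imported at all."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attribute)


if __name__ == "__main__":
    result = module_exports("pyrate_limiter", "RequestRate")
    if result is None:
        print("pyrate_limiter is not installed in this environment")
    elif result:
        print("pyrate_limiter exposes RequestRate")
    else:
        print("pyrate_limiter is installed but does not expose RequestRate")
```

Run inside the same Docker image or virtual environment that builds the site, since that is where the failing import happens.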

vincerubinetti commented 1 year ago

~Indeed, all that needs to be done is updating the version of Manubot in your /_cite/requirements.txt file. You can either upgrade to Manubot v0.5.6, or unpin the exact version as I have done for all packages in #212.~

See comments below for immediate fix.

I apologize for not fixing this sooner. I thought it was a more niche edge case, so I didn't prioritize it.

chrisamiller commented 1 year ago

Thanks for the quick response, and apologies if I'm misunderstanding something here. I updated _cite/requirements.txt to read manubot==0.5.6 (instead of 0.5.5) and then relaunched Docker via the ./.docker/run.sh script. Docker does appear to build a new image with the updated requirements, but it still gives the same string of errors when parsing that pubmed.yaml:

Processing source 2 of 259, id: pubmed:37160317
    Using Manubot to generate citation## WARNING
Generating csl_item for 'pubmed:37160317' failed due to a ImportError:
cannot import name 'RequestRate' from 'pyrate_limiter' (/usr/local/lib/python3.9/dist-packages/pyrate_limiter/__init__.py)

            Couldn't parse Manubot response

chrisamiller commented 1 year ago

I also tried replicating that PR and unpinning the patch version of every package in requirements.txt:

manubot==0.5.*
PyYAML==6.0.*
diskcache==5.4.*
rich==12.6.*
python-dotenv==0.21.*
google-search-results==2.4.*

Again, a new docker image is built, but it still chokes when importing from pubmed with the same errors.
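
For reference, a pin such as `manubot==0.5.*` only admits releases in the 0.5 series, so it can only help if one of those releases contains the fix. The sketch below is a deliberately simplified prefix match that illustrates this; `matches_wildcard` is a hypothetical helper and does not reproduce pip's full version-matching rules.

```python
def matches_wildcard(version, pattern):
    """Simplified check of an `X.Y.*` style pattern against a dotted version.

    Only handles plain numeric dotted versions; real specifier matching
    covers many more cases (pre-releases, post-releases, and so on).
    """
    if not pattern.endswith(".*"):
        return version == pattern
    prefix = pattern[:-2].split(".")
    parts = version.split(".")
    return parts[:len(prefix)] == prefix


if __name__ == "__main__":
    for candidate in ("0.5.5", "0.5.6", "0.6.0"):
        print(candidate, matches_wildcard(candidate, "0.5.*"))
```

The comparison is done on whole dotted components, so `0.50.1` does not match `0.5.*`.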

chrisamiller commented 1 year ago

Sorry for the string of comments here! It looks like manubot 0.5.6 was released in February, but the PR linked there was just merged Sep 1st (so isn't in a release yet). https://github.com/manubot/manubot/commit/6e6f6a5aac381120faf3ef02e594b5babc77da2b

Do we need to poke the manubot folks for a new release to resolve this?

vincerubinetti commented 1 year ago

Ack, you're right. I saw 0.5.6 on PyPI and assumed it included the fix. Yes, we need to prod them to release this fix. I will do so.

https://github.com/manubot/manubot/pull/368

vincerubinetti commented 1 year ago

If you want an immediate fix, I think you can do:

pip3 install git+https://github.com/manubot/manubot.git

which produces this in requirements.txt:

manubot @ git+https://github.com/manubot/manubot.git@6e6f6a5aac381120faf3ef02e594b5babc77da2b

So if you just replace your requirements.txt Manubot line with the above, it hopefully should work.
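
If you would rather script that replacement than edit the file by hand, something like the following could work. It is a minimal sketch: `pin_manubot` is a hypothetical helper, and it assumes the Manubot requirement sits on a single line starting with `manubot`, as in the requirements listings quoted earlier in this thread.

```python
import re

# The line quoted above, pinning Manubot to the commit that contains the fix.
MANUBOT_GIT_PIN = (
    "manubot @ git+https://github.com/manubot/manubot.git"
    "@6e6f6a5aac381120faf3ef02e594b5babc77da2b"
)

# Matches a requirement line for the package named exactly "manubot",
# not a differently named package that merely starts with the same letters.
_MANUBOT_LINE = re.compile(r"^manubot(?![\w.-])", re.IGNORECASE)


def pin_manubot(requirements_text, replacement=MANUBOT_GIT_PIN):
    """Return the requirements text with the manubot line replaced."""
    lines = requirements_text.splitlines()
    updated = [
        replacement if _MANUBOT_LINE.match(line.strip()) else line
        for line in lines
    ]
    return "\n".join(updated) + "\n"


if __name__ == "__main__":
    original = "manubot==0.5.5\nPyYAML==6.0\n"
    print(pin_manubot(original), end="")
```

To apply it for real, read `_cite/requirements.txt`, pass its contents through `pin_manubot`, write the result back, and rebuild the Docker image so the new requirement is installed.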

chrisamiller commented 1 year ago

That seems to have done the trick for now! I'll keep an eye out for that manubot release. Thanks!

vincerubinetti commented 1 year ago

Manubot 0.6.0 is out on PyPI, fixing the issue. LWT 1.1.6 is out with a change to the Python dependency versions.