WormBase / ACKnowledge

Author Curation to Knowledgebases
MIT License

add tazendra user and passwd to jenkins config for afp pipeline #172

Closed (valearna closed 3 years ago)

valearna commented 4 years ago

To be done before release

azurebrd commented 4 years ago

Which user and password?

valearna commented 4 years ago

@azurebrd The new pipeline makes URL requests to download PDFs from tazendra, passing a username and password with each request. This is to avoid auth errors when running from a VPN IP address that is not in tazendra's whitelist. This ticket is a reminder for me to modify the configuration on jenkins so that the username and password are passed as arguments to the pipeline before releasing version 2.0.
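For illustration only, a minimal sketch of what passing credentials as arguments and sending them with a download request could look like. It assumes HTTP Basic authentication and uses only the Python standard library; the argument names (`--url`, `--user`, `--password`) and the helper functions are hypothetical and are not taken from the actual pipeline.

```python
import argparse
import base64
import urllib.request


def parse_args(argv=None):
    # Hypothetical argument names; the real pipeline's options may differ.
    parser = argparse.ArgumentParser(description="Fetch a PDF using credentials")
    parser.add_argument("--url", required=True, help="URL of the PDF to download")
    parser.add_argument("--user", required=True, help="username for the server")
    parser.add_argument("--password", required=True, help="password for the server")
    return parser.parse_args(argv)


def build_request(url, user, password):
    # Assumption: the server accepts HTTP Basic auth, i.e. an
    # "Authorization: Basic <base64(user:password)>" header.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


def download_pdf(url, user, password, timeout=30):
    # Send the authenticated request and return the raw PDF bytes.
    request = build_request(url, user, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    args = parse_args()
    pdf_bytes = download_pdf(args.url, args.user, args.password)
    print(f"downloaded {len(pdf_bytes)} bytes")
```

With something like this, the jenkins job would supply the values at invocation time, so the credentials stay in the job configuration rather than in the pipeline code.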

azurebrd commented 4 years ago

@valearna Oh, is that for authors to see the PDFs, or curators? In theory curators should be able to log in and download, and I don't think we're supposed to let non-Caltech people see the PDFs (although we probably have). I'm assuming that even though the user/pass are stored somewhere, it's secure?

valearna commented 4 years ago

@azurebrd This is only for the pipeline, which does not give curators or authors access to the PDFs. The pipeline runs in the background weekly, fetches newly added PDFs from tazendra, extracts entities and stores them in postgres. The username and password will be stored in the jenkins config file, which is on the textpressocentral server and secured by authentication.

azurebrd commented 4 years ago

@valearna ooooh, cool cool, thanks. I'd thought the pipeline would get the text of the papers from textpresso. I didn't realize that this was an update to that pipeline, or that it had to be separate for some reason.

valearna commented 4 years ago

@azurebrd we are working on a major update to the pipeline, and I added this feature after I first encountered the VPN issue a few months ago. We get the PDFs from tazendra instead of textpresso because in some cases textpresso fails to convert the PDF to text, whereas we are able to do the conversion through a python library.