JabRef / jabref

Graphical Java application for managing BibTeX and biblatex (.bib) databases
https://devdocs.jabref.org
MIT License

"Download linked file" option creates an html file instead of downloading the pdf on Windows 10. #10149

Closed bhatiaa1 closed 6 months ago

bhatiaa1 commented 1 year ago

JabRef version

Latest development branch build (please note build date below)

Operating system

Windows

Details on version and operating system

Windows 10 21H2

Checked with the latest development build

Steps to reproduce the behaviour

"Download Linked File" option on JabRef 5.4 works fine within the same corporate firewalled network environment and on the same Windows 10 computer.

  1. Download and unzip JabRef 5.10 portable within a corporate firewall environment on a Windows 10 computer. Post-install details are as follows: JabRef 5.10--2023-08-07--3d197a9 Windows 10 10.0 amd64 Java 21-internal JavaFX 20+19

(NOTE: The same issue is also observed in an admin install of the following JabRef version: JabRef 5.10--2023-07-18--34ac6d1 Windows 10 10.0 amd64 Java 21-internal JavaFX 20+19)

  2. Import the SSL certificate within Preferences->Network. Use the custom proxy option within Preferences->Network.

  3. Verify internet connectivity using the button in Preferences->Network. Confirmed that it works.

  4. Go to the JabRef web search and choose ArXiv for searching. Search for "https://arxiv.org/abs/2307.14043".

  5. Select the "Download linked file" option in the search result and import the entry into the bib file.

RESULT: The entry is imported but an empty html file is created instead of downloading actual pdf from ArXiv website.

Appendix

Screenshot of erroneous file download: ![image](https://github.com/JabRef/jabref/assets/88478555/36f29c92-ba0c-4172-a189-7337847c91f2)

ThiloteE commented 1 year ago

@calixtus - Is this related to #10044?

Siedlerchr commented 1 year ago

I think this was already the behaviour earlier: if the download fails for some reason, the website is stored instead. Refs https://github.com/JabRef/jabref/issues/7452

calixtus commented 1 year ago

> @calixtus - Is this related to #10044?

Nope, PR 10044 is about storing the password between the sessions.

koppor commented 1 year ago

This happens if JabRef does not have access to the web site. Sometimes, the browser has access (e.g., by institutional login), but JabRef does not. Proposal:

  1. Do not download HTML, but issue a warning. (Alternatively, ask the user if they really want to download HTML)
  2. Update: Try to use https://github.com/JetBrains/jcef to establish a connection.

Old step 2 was: Work on a JabRef-Browser-connection so that JabRef does not do the HTTP connection itself, but uses the running web browser (e.g. Chrome or Firefox) to do it. -- That seems to be too hard.
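
A minimal sketch of proposal 1, assuming the download code can see the response's `Content-Type` header and the first bytes of the body. The class `DownloadCheck` and its methods are hypothetical names for illustration and are not taken from JabRef's code base. The idea is to refuse to store a response that is an HTML page, while still accepting a real PDF even if the server mislabels it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Hypothetical helper for illustration; not part of JabRef. */
public final class DownloadCheck {

    // Every PDF file starts with the signature "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    private DownloadCheck() {
    }

    /** Returns true if the Content-Type header value denotes an HTML page. */
    public static boolean isHtmlContentType(String contentType) {
        if (contentType == null) {
            return false;
        }
        // Drop parameters such as "; charset=UTF-8" before comparing.
        String mime = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return mime.equals("text/html") || mime.equals("application/xhtml+xml");
    }

    /** Returns true if the payload begins with the PDF signature. */
    public static boolean startsWithPdfMagic(byte[] firstBytes) {
        if (firstBytes == null || firstBytes.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (firstBytes[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Decides whether a response should be stored as a linked file.
     * A body with the PDF signature is always kept; otherwise anything
     * declared as HTML is rejected, and other types are kept.
     */
    public static boolean shouldStore(String contentType, byte[] firstBytes) {
        if (startsWithPdfMagic(firstBytes)) {
            return true;
        }
        return !isHtmlContentType(contentType);
    }
}
```

With such a check, a login page or bot-check page returned in place of the PDF would be rejected before anything is written to disk, and the caller could then warn the user.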

Mustakeem733 commented 9 months ago

Hi @koppor, I think the issue is specific to Windows 10. My system runs Windows 11, and I tried the URL given by @bhatiaa1 using the web search with "Download linked online files" checked, then imported the entry. The PDF was downloaded, saved in the folder, and opens fine. I also tried the General tab and downloaded the HTML file; it was downloaded and saved correctly, with the HTML code intact.

koppor commented 9 months ago

I checked on Linux, and I can retrieve the PDF. Nevertheless, my comment https://github.com/JabRef/jabref/issues/10149#issuecomment-1707778048 is still valid. Maybe one needs a different test URL.

Example paper (randomly chosen from IEEE: https://ieeexplore.ieee.org/document/8089824)

  1. Web search
  2. Select "IEEEXplore"
  3. Enter "Automating the Provisioning and Integration of Analytics Tools with Data Resources in Industrial Environments Using OpenTOSCA" as the search term
  4. Click on "Search"
  5. Import dialog appears
  6. Select first entry
  7. Ensure that "Download linked online files" is active
  8. Click on "Import entries"
  9. JabRef outputs "Downloaded websites as an HTML file."

Step 9 should not happen. JabRef should just issue a warning (notification). If no dialogService is available, just skip.
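
The desired behaviour could be sketched as follows. This is only an illustration under stated assumptions: `HtmlDownloadGuard` and `warnInsteadOfStoring` are hypothetical names, and a plain `Consumer<String>` stands in for whatever notification mechanism the dialog service offers. When no notifier is available (for example, in a non-GUI context), the method does nothing.

```java
import java.util.function.Consumer;

/** Hypothetical helper for illustration; not part of JabRef. */
public final class HtmlDownloadGuard {

    private HtmlDownloadGuard() {
    }

    /**
     * Called when the server returned an HTML page instead of a file.
     * Nothing is stored. The user is warned if a notifier is available.
     *
     * @param url      the address that was requested
     * @param notifier stand-in for a dialog service; may be null
     * @return true if a warning was shown, false if it was skipped
     */
    public static boolean warnInsteadOfStoring(String url, Consumer<String> notifier) {
        if (notifier == null) {
            // No dialog service available: skip silently.
            return false;
        }
        notifier.accept("No file downloaded from " + url
                + ": the server returned an HTML page instead.");
        return true;
    }
}
```

A caller would invoke this in place of writing the HTML to disk, so step 9 would produce a notification instead of a stored `.html` file.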

I updated my comment https://github.com/JabRef/jabref/issues/10149#issuecomment-1707778048 to point to https://github.com/JetBrains/jcef.

Guanzhw commented 4 months ago

I tried JabRef for the first time today. I love the "citation info" and "citation relations" features, but I also encountered this problem. The downloaded HTML is actually a machine-human (bot check) test page. I can access the file manually through the browser.

Hope this would help!

koppor commented 4 months ago

Fixing this requires a huge amount of work. I think this will be fixed if we implement https://github.com/JabRef/jabref/issues/11093.

I will leave the issue open for other users searching for this kind of issue.