lukasschwab / arxiv.py

Python wrapper for the arXiv API
MIT License

Error when downloading paper with '/' in short id #117

Closed larstz closed 1 year ago

larstz commented 1 year ago

Description

If a paper's short ID contains a '/', downloading the source or PDF without providing a filename fails with FileNotFoundError: [Errno 2] No such file or directory: './hep-ex/0406020v1.Sparticle_Reconstruction_at_LHC.pdf'

I suspect the cause is the _get_default_filename() method, which builds the filename from the full result of get_short_id(), including the '/', so everything before the slash is treated as a directory that does not exist.

The same happens for papers with a '/' in the title; see, for example, search = arxiv.Search(id_list=['1211.0496v1']).
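
To illustrate the suspected cause without touching the network, here is a minimal sketch that uses only the standard library. The helper `try_write` is hypothetical (it is not part of arxiv.py); it mimics writing a default filename into a directory and reports what happens when that filename contains a '/'.

```python
import os
import tempfile


def try_write(dirpath: str, filename: str) -> str:
    """Try to create dirpath/filename; return 'ok' or the exception class name.

    Hypothetical helper for illustration only, not arxiv.py code.
    """
    path = os.path.join(dirpath, filename)
    try:
        with open(path, "wb") as f:
            f.write(b"")
        return "ok"
    except OSError as e:
        return type(e).__name__


with tempfile.TemporaryDirectory() as tmp:
    # The '/' makes 'hep-ex' a directory component, and that directory is missing.
    result_with_slash = try_write(
        tmp, "hep-ex/0406020v1.Sparticle_Reconstruction_at_LHC.pdf"
    )
    # With the slash replaced, the name is a plain file in tmp.
    result_without_slash = try_write(
        tmp, "hep-ex.0406020v1.Sparticle_Reconstruction_at_LHC.pdf"
    )

print(result_with_slash, result_without_slash)
```

The first call fails because `open` does not create intermediate directories; the second succeeds because the name no longer contains a path separator.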

Steps to reproduce

import arxiv

search = arxiv.Search(id_list=['hep-ex/0406020v1'])
paper = next(search.results())
paper.download_pdf()  # no filename given, so the default filename is used

Expected behavior

Downloading the source or PDF of a paper without providing a filename should work the same way for every paper. One option would be to replace every '/' with a character that cannot be confused with a directory separator (e.g. '.', as in most paper IDs).
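
The substitution could look roughly like the sketch below. The function `safe_default_filename` and its parameters are assumptions made for illustration; this is not the library's actual `_get_default_filename()` implementation, and the `id.title.extension` layout is inferred only from the filename in the error message above.

```python
import re


def safe_default_filename(short_id: str, title: str, extension: str = "pdf") -> str:
    """Build a default filename that contains no path separators.

    Hypothetical sketch: the name and signature are not from arxiv.py.
    """
    # Old-style IDs such as 'hep-ex/0406020v1' contain a slash; swap it for '.'.
    safe_id = short_id.replace("/", ".")
    # Replace every non-word character in the title (spaces, '/', ...) with '_'.
    safe_title = re.sub(r"[^\w]", "_", title) if title else "UNTITLED"
    return ".".join([safe_id, safe_title, extension])


print(safe_default_filename("hep-ex/0406020v1", "Sparticle Reconstruction at LHC"))
```

Handling the title in the same pass also covers papers whose title, rather than ID, contains a '/'.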

Versions

lukasschwab commented 1 year ago

@larstz great bug report — thanks for opening it!

One option would be to replace every '/' with a character that cannot be confused with a directory separator (e.g. '.', as in most paper IDs).

This seems like a sensible design. It might also be possible to escape the slash, but I'm not sure if that's desirable/portable; I can do some reading and get a fix out in ~12 hours. Feel free to open a .-substitution PR before then!

lukasschwab commented 1 year ago

@larstz patched and released. Thanks again for the bug report! 🙇