rajatomar788 / pywebcopy

Locally saves webpages to your hard disk with images, css, js & links as is.
https://rajatomar788.github.io/pywebcopy/

ValueError: path is on mount 'S:', start on mount 'C:' #56

Open kevtv opened 4 years ago

kevtv commented 4 years ago

I am getting this error. Any idea why?

from pywebcopy import save_webpage

kwargs = {'project_name': 'some-fancy-name'}
url = 'https://www.bloomberg.com/company/career/global-data-externship/'

save_webpage(
    url=url,
    project_folder=r'C:\Users\Kevin\Downloads\Cheats',
    **kwargs
)

Log

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\threading.py", line 926, in _bootstrap_inner
    self.run()
  File "C:\ProgramData\Anaconda3\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pywebcopy\elements.py", line 331, in run
    contents = self.replace_urls(req.content, self.repl)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pywebcopy\elements.py", line 292, in replace_urls
    contents = CSS_URLS_RE.sub(repl, css_string)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pywebcopy\elements.py", line 273, in repl
    url = pathname2url(relate(new_element.file_path, self.file_path))
  File "C:\ProgramData\Anaconda3\lib\site-packages\pywebcopy\urls.py", line 438, in relate
    return os.path.join(os.path.relpath(target_dir, start_dir), os.path.basename(target_file))
  File "C:\ProgramData\Anaconda3\lib\ntpath.py", line 562, in relpath
    path_drive, start_drive))
ValueError: path is on mount 'S:', start on mount 'C:'

rajatomar788 commented 4 years ago

Did you set project_folder to an absolute path (i.e. one starting with a slash) and not a relative one?

NickVeld commented 3 years ago

I ran into it too.

pywebcopy.config.setup_config(config["endpoint"], ".")

The error says that something tried to compute the relative path between two paths on different drives. It is definitely an internal problem, because there is no "S:" mount in my case, and I suppose not in @kevtv's case either.
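To illustrate the error itself, here is a minimal sketch that reproduces the same ValueError with the standard library alone. It uses ntpath (the Windows implementation behind os.path) so it runs on any platform. The "S:" target path is a made-up placeholder, not a path taken from pywebcopy.

```python
import ntpath  # Windows path rules; importable on any platform

# Hypothetical target on another drive; the start folder is the one from the report.
target = r"S:\site\css\style.css"
start = r"C:\Users\Kevin\Downloads\Cheats"

error_message = None
try:
    # A relative path cannot cross drive letters, so this raises ValueError.
    ntpath.relpath(target, start)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# On the same drive the call succeeds as expected.
same_drive = ntpath.relpath(r"C:\Users\Kevin\Downloads\Cheats\css", start)
print(same_drive)
```

So the exception only says that the two paths ended up with different drive letters; it does not say where the unexpected drive came from.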

NickVeld commented 3 years ago

As for why the "S:" drive appears, see https://bugs.python.org/issue44810

But the root problem is that url = match_obj.group(1) returns the URL with its quotes (I am getting "https://a/b" instead of https://a/b).

NickVeld commented 3 years ago

I am getting "https://a/b" instead of https://a/b because the regexp pattern intentionally includes the quotes. But why?
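A minimal sketch of the quoting behaviour described here, under stated assumptions: CSS_URL_RE below is a simplified stand-in, not pywebcopy's actual CSS_URLS_RE, and extract_url is a hypothetical helper. Stripping the quotes after matching is one possible way to handle it and is not necessarily what the linked pull request does.

```python
import re

# Simplified stand-in pattern (an assumption, not pywebcopy's CSS_URLS_RE).
# The capture group takes everything inside url(...), including any quotes.
CSS_URL_RE = re.compile(r"url\(\s*(.*?)\s*\)")


def extract_url(match_obj):
    """Return the captured URL with surrounding single or double quotes removed."""
    url = match_obj.group(1)
    return url.strip("'\"")


css = 'body { background: url("https://a/b"); }'
match = CSS_URL_RE.search(css)

raw = match.group(1)        # still carries the quotes from the stylesheet
clean = extract_url(match)  # quotes removed

print(repr(raw))
print(repr(clean))
```

With the quotes left in place, the captured string is no longer a plain URL, so any later step that turns it into a file path starts from the wrong input.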

NickVeld commented 3 years ago

For the solution, see: https://github.com/rajatomar788/pywebcopy/pull/73