typesense / typesense-docsearch-scraper

A fork of Algolia's awesome DocSearch Scraper, customized to index data in Typesense (an open source alternative to Algolia)
https://typesense.org/docs/guide/docsearch.html

Respect TYPESENSE_PROTOCOL to set up TYPESENSE_PORT #15

Open tnir opened 2 years ago

tnir commented 2 years ago

Description

When TYPESENSE_PROTOCOL is set to https and TYPESENSE_PORT is not specified, TYPESENSE_PORT should default to 443.
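The requested fallback could look roughly like the sketch below. It is only an illustration, not the scraper's actual code: `resolve_typesense_port` and `DEFAULT_PORTS` are hypothetical names, and the `http` to 80 mapping is an assumption that goes beyond what this issue asks for (only `https` to 443 is requested).

```python
# Hypothetical helper: choose a port from the environment, falling back to the
# protocol's conventional default when TYPESENSE_PORT is unset or empty.
DEFAULT_PORTS = {
    "https": 443,  # the fallback requested in this issue
    "http": 80,    # assumption: not requested here, added for symmetry
}


def resolve_typesense_port(env):
    """Return the port as an int, given a mapping such as os.environ."""
    port = env.get("TYPESENSE_PORT")
    if port:
        # An explicit port always wins over the protocol default.
        return int(port)

    protocol = env.get("TYPESENSE_PROTOCOL", "").lower()
    if protocol in DEFAULT_PORTS:
        return DEFAULT_PORTS[protocol]

    # Fail early with a clear message instead of building a ":None" URL.
    raise ValueError(
        "TYPESENSE_PORT is not set and no default is known for "
        "TYPESENSE_PROTOCOL=%r" % protocol
    )
```

Calling `resolve_typesense_port(os.environ)` before constructing the client would keep an explicit TYPESENSE_PORT untouched and only fill in the default when it is missing.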

Steps to reproduce

n/a

Expected Behavior

With TYPESENSE_PROTOCOL=https and no TYPESENSE_PORT, the scraper connects on port 443.

Actual Behavior

The unset port ends up in the request URL as None, and the request fails with InvalidURL:

$ docker run -it --env-file=$(pwd)/.env -e "CONFIG=$(cat $(pwd)/config.json | jq -r tostring)" typesense/docsearch-scraper
Traceback (most recent call last):
  File "/root/.local/share/virtualenvs/root-BuDEOXnJ/lib/python3.6/site-packages/requests/models.py", line 382, in prepare_url
    scheme, auth, host, port, path, query, fragment = parse_url(url)
  File "/root/.local/share/virtualenvs/root-BuDEOXnJ/lib/python3.6/site-packages/urllib3/util/url.py", line 394, in parse_url
    return six.raise_from(LocationParseError(source_url), None)
  File "<string>", line 3, in raise_from
urllib3.exceptions.LocationParseError: Failed to parse: https://CLUSTER.a1.typesense.net:None/collections/INDEX_TIMETIME

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/src/index.py", line 116, in <module>
    run_config(environ['CONFIG'])
  File "/root/src/index.py", line 43, in run_config
    typesense_helper.create_tmp_collection()
  File "/root/src/typesense_helper.py", line 30, in create_tmp_collection
    self.typesense_client.collections[self.collection_name_tmp].delete()
  File "/root/.local/share/virtualenvs/root-BuDEOXnJ/lib/python3.6/site-packages/typesense/collection.py", line 22, in delete
    return self.api_call.delete(self._endpoint_path())
  File "/root/.local/share/virtualenvs/root-BuDEOXnJ/lib/python3.6/site-packages/typesense/api_call.py", line 159, in delete
    params=params, timeout=self.config.connection_timeout_seconds)
  File "/root/.local/share/virtualenvs/root-BuDEOXnJ/lib/python3.6/site-packages/typesense/api_call.py", line 129, in make_request
    raise last_exception
  File "/root/.local/share/virtualenvs/root-BuDEOXnJ/lib/python3.6/site-packages/typesense/api_call.py", line 103, in make_request
    r = fn(url, headers={ApiCall.API_KEY_HEADER_NAME: self.config.api_key}, **kwargs)
  File "/root/.local/share/virtualenvs/root-BuDEOXnJ/lib/python3.6/site-packages/requests/api.py", line 161, in delete
    return request('delete', url, **kwargs)
  File "/root/.local/share/virtualenvs/root-BuDEOXnJ/lib/python3.6/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/root/.local/share/virtualenvs/root-BuDEOXnJ/lib/python3.6/site-packages/requests/sessions.py", line 528, in request
    prep = self.prepare_request(req)
  File "/root/.local/share/virtualenvs/root-BuDEOXnJ/lib/python3.6/site-packages/requests/sessions.py", line 466, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/root/.local/share/virtualenvs/root-BuDEOXnJ/lib/python3.6/site-packages/requests/models.py", line 316, in prepare
    self.prepare_url(url, params)
  File "/root/.local/share/virtualenvs/root-BuDEOXnJ/lib/python3.6/site-packages/requests/models.py", line 384, in prepare_url
    raise InvalidURL(*e.args)
requests.exceptions.InvalidURL: Failed to parse: https://CLUSTER.a1.typesense.net:None/collections/INDEX_TIMETIME
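The `:None` in the failing URL is consistent with a missing port being formatted straight into the URL string. The traceback does not show the client's URL-building code, so the snippet below is only a minimal reproduction of that pattern with hypothetical function names (`build_url`, `build_url_checked`), not the Typesense client's implementation.

```python
def build_url(protocol, host, port, path):
    """Naive URL construction: a None port is rendered as the text 'None'."""
    return "{}://{}:{}{}".format(protocol, host, port, path)


def build_url_checked(protocol, host, port, path):
    """Same construction, but reject a missing port before any request is made."""
    if port is None:
        raise ValueError("port is not set; set TYPESENSE_PORT explicitly")
    return build_url(protocol, host, port, path)


# With no port configured, the naive version yields a URL of the same shape
# as the one in the traceback above.
bad_url = build_url("https", "CLUSTER.a1.typesense.net", None, "/collections/INDEX")
print(bad_url)
```

Until a default is implemented, setting TYPESENSE_PORT=443 explicitly in the .env file should avoid the malformed URL.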

Metadata

Typesense Version: unknown

$ docker image ls typesense/docsearch-scraper
REPOSITORY                    TAG       IMAGE ID       CREATED        SIZE
typesense/docsearch-scraper   latest    c099811c8bfa   4 months ago   1.76GB