VikParuchuri / textbook_quality

Generate textbook-quality synthetic LLM pretraining data
MIT License
488 stars, 50 forks

password authentication failed when run invoke migrate-dev #9

Closed (luotongml closed 1 year ago)

luotongml commented 1 year ago

I am running the program in WSL2 Ubuntu on Windows 11. I followed these steps:

    psql postgres -c "create database textbook;"
    git clone https://github.com/VikParuchuri/textbook_quality.git
    cd textbook_quality
    poetry install
    poetry shell

An error occurs when I run the command invoke migrate-dev. I am not very familiar with PostgreSQL. I created the role tluo with access to the database textbook, but I can't tell what the problem is. Thanks for your help.

Traceback (most recent call last): File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/bin/alembic", line 8, in sys.exit(main()) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/alembic/config.py", line 630, in main CommandLine(prog=prog).main(argv=argv) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/alembic/config.py", line 624, in main self.run_cmd(cfg, options) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/alembic/config.py", line 601, in run_cmd fn( File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/alembic/command.py", line 399, in upgrade script.run_env() File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/alembic/script/base.py", line 578, in run_env util.load_python_file(self.dir, "env.py") File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/alembic/util/pyfiles.py", line 93, in load_python_file module = load_module_py(module_id, path) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/alembic/util/pyfiles.py", line 109, in load_module_py spec.loader.exec_module(module) # type: ignore File "", line 883, in exec_module File "", line 241, in _call_with_frames_removed File "/home/tluo/llm/textbook_quality/alembic/env.py", line 108, in run_migrations_online() File "/home/tluo/llm/textbook_quality/alembic/env.py", line 102, in run_migrations_online asyncio.run(run_async_migrations()) File "/home/tluo/anaconda3/lib/python3.10/asyncio/runners.py", line 44, in run return loop.run_until_complete(main) File "/home/tluo/anaconda3/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete return future.result() File 
"/home/tluo/llm/textbook_quality/alembic/env.py", line 65, in run_async_migrations async with connectable.connect() as connection: File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/ext/asyncio/base.py", line 60, in aenter return await self.start(is_ctxmanager=True) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/ext/asyncio/engine.py", line 157, in start await (greenlet_spawn(self.sync_engine.connect)) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 126, in greenlet_spawn result = context.throw(sys.exc_info()) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/future/engine.py", line 406, in connect return super(Engine, self).connect() File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 3315, in connect return self._connection_cls(self, close_with_result=close_with_result) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 96, in init else engine.raw_connection() File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 3394, in raw_connection return self._wrap_pool_connect(self.pool.connect, _connection) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 3361, in _wrap_pool_connect return fn() File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 320, in connect return _ConnectionFairy._checkout(self) File 
"/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 884, in _checkout fairy = _ConnectionRecord.checkout(pool) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 486, in checkout rec = pool._do_get() File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/pool/impl.py", line 256, in _do_get return self._create_connection() File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 266, in _create_connection return _ConnectionRecord(self) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 381, in init self.connect() File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 677, in connect with util.safereraise(): File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 70, in exit compat.raise( File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 208, in raise_ raise exception File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 673, in __connect self.dbapi_connection = connection = pool._invoke_creator(self) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/create.py", line 578, in connect return dialect.connect(cargs, cparams) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 
598, in connect return self.dbapi.connect(*cargs, *cparams) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 780, in connect await_only(self.asyncpg.connect(arg, kw)), File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 68, in await_only return current.driver.switch(awaitable) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 121, in greenlet_spawn value = await result File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/asyncpg/connection.py", line 2114, in connect return await connect_utils._connect( File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/asyncpg/connect_utils.py", line 982, in _connect conn = await _connect_addr( File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/asyncpg/connect_utils.py", line 817, in _connect_addr return await __connect_addr(params_retry, timeout, False, *args) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/asyncpg/connect_utils.py", line 866, in __connect_addr await compat.wait_for(connected, timeout=timeout) File "/home/tluo/.cache/pypoetry/virtualenvs/textbook-quality-s5WH1yUv-py3.10/lib/python3.10/site-packages/asyncpg/compat.py", line 60, in wait_for return await asyncio.wait_for(fut, timeout) File "/home/tluo/anaconda3/lib/python3.10/asyncio/tasks.py", line 445, in wait_for return fut.result() asyncpg.exceptions.InvalidPasswordError: password authentication failed for user "tluo"

poetry show aiohttp 3.8.6 Async http client/server framework (asyncio) aiosignal 1.3.1 aiosignal: a list of registered asynchronous callbacks alembic 1.12.0 A database migration tool for SQLAlchemy. anyio 4.0.0 High level compatibility layer for multiple asynchronous event loop implementations argon2-cffi 23.1.0 Argon2 for Python argon2-cffi-bindings 21.2.0 Low-level CFFI bindings for Argon2 arrow 1.3.0 Better dates & times for Python astroid 2.15.8 An abstract syntax tree for Python with inference support. asttokens 2.4.0 Annotate AST trees with source code positions async-lru 2.0.4 Simple LRU cache for asyncio async-timeout 4.0.3 Timeout context manager for asyncio programs asyncpg 0.28.0 An asyncio PostgreSQL driver attrs 23.1.0 Classes Without Boilerplate autoflake 2.2.1 Removes unused imports and unused variables babel 2.13.0 Internationalization utilities backcall 0.2.0 Specifications for callback functions passed in to an API beautifulsoup4 4.12.2 Screen-scraping library black 23.9.1 The uncompromising code formatter. bleach 6.1.0 An easy safelist-based HTML-sanitizing tool. certifi 2023.7.22 Python package for providing Mozilla's CA Bundle. cffi 1.16.0 Foreign Function Interface for Python calling C code. charset-normalizer 3.3.0 The Real First Universal Charset Detector. Open, modern and actively maintained alternative to Chardet. click 8.1.7 Composable command line interface toolkit cmake 3.27.6 CMake is an open-source, cross-platform family of tools designed to build, test and package software comm 0.1.4 Jupyter Python Comm implementation, for usage in ipykernel, xeus-python etc. 
datasets 2.14.5 HuggingFace community-driven open-source library of datasets debugpy 1.8.0 An implementation of the Debug Adapter Protocol for Python decorator 5.1.1 Decorators for Humans defusedxml 0.7.1 XML bomb protection for Python stdlib modules dill 0.3.7 serialize all of Python exceptiongroup 1.1.3 Backport of PEP 654 (exception groups) executing 2.0.0 Get the currently executing AST node of a frame, and other information fastjsonschema 2.18.1 Fastest Python implementation of JSON schema filelock 3.12.4 A platform independent file lock. fqdn 1.5.1 Validates fully-qualified domain names against RFC 1123, so that they are acceptable to modern bowsers frozenlist 1.4.0 A list-like structure which implements collections.abc.MutableSequence fsspec 2023.6.0 File-system specification ftfy 6.1.1 Fixes mojibake and other problems with Unicode, after the fact greenlet 2.0.2 Lightweight in-process concurrent programming huggingface-hub 0.17.3 Client library to download and publish models, datasets and other repos on the huggingface.co hub idna 3.4 Internationalized Domain Names in Applications (IDNA) invoke 2.2.0 Pythonic task execution ipykernel 6.25.2 IPython Kernel for Jupyter ipython 8.16.1 IPython: Productive Interactive Computing ipython-genutils 0.2.0 Vestigial utilities from IPython ipywidgets 8.1.1 Jupyter interactive widgets isoduration 20.11.0 Operations with ISO 8601 durations isort 5.12.0 A Python utility / library to sort Python imports. jedi 0.19.1 An autocompletion tool for Python that can be used for text editors. jinja2 3.1.2 A very fast and expressive template engine. joblib 1.3.2 Lightweight pipelining with Python functions json5 0.9.14 A Python implementation of the JSON5 data format. 
jsonpointer 2.4 Identify specific nodes in a JSON document (RFC 6901) jsonschema 4.19.1 An implementation of JSON Schema validation for Python jsonschema-specifications 2023.7.1 The JSON Schema meta-schemas and vocabularies, exposed as a Registry jupyter 1.0.0 Jupyter metapackage. Install all the Jupyter components in one go. jupyter-client 8.3.1 Jupyter protocol implementation and client libraries jupyter-console 6.6.3 Jupyter terminal console jupyter-core 5.3.2 Jupyter core package. A base package on which Jupyter projects rely. jupyter-events 0.7.0 Jupyter Event System library jupyter-lsp 2.2.0 Multi-Language Server WebSocket proxy for Jupyter Notebook/Lab server jupyter-server 2.7.3 The backend—i.e. core services, APIs, and REST endpoints—to Jupyter web applications. jupyter-server-terminals 0.4.4 A Jupyter Server Extension Providing Terminals. jupyterlab 4.0.6 JupyterLab computational environment jupyterlab-pygments 0.2.2 Pygments theme using JupyterLab CSS variables jupyterlab-server 2.25.0 A set of server components for JupyterLab and JupyterLab like applications. jupyterlab-widgets 3.0.9 Jupyter interactive widgets for JupyterLab lazy-object-proxy 1.9.0 A fast and thorough lazy object proxy. lit 17.0.2 A Software Testing Tool mako 1.2.4 A super-fast templating language that borrows the best ideas from the existing templating languages. markdown 3.5 Python implementation of John Gruber's Markdown. markupsafe 2.1.3 Safely add untrusted strings to HTML/XML markup. 
matplotlib-inline 0.1.6 Inline Matplotlib backend for Jupyter mccabe 0.7.0 McCabe checker, plugin for flake8 mistune 3.0.2 A sane and fast Markdown parser with useful plugins and renderers mpmath 1.3.0 Python library for arbitrary-precision floating-point arithmetic msgpack 1.0.7 MessagePack serializer multidict 6.0.4 multidict implementation multiprocess 0.70.15 better multiprocessing and multithreading in Python mypy-extensions 1.0.0 Type system extensions for programs checked with the mypy type checker. nbclient 0.8.0 A client library for executing notebooks. Formerly nbconvert's ExecutePreprocessor. nbconvert 7.9.2 Converting Jupyter Notebooks nbformat 5.9.2 The Jupyter Notebook format nest-asyncio 1.5.8 Patch asyncio to allow nested event loops networkx 3.1 Python package for creating and manipulating graphs and networks nltk 3.8.1 Natural Language Toolkit notebook 7.0.4 Jupyter Notebook - A web-based notebook environment for interactive computing notebook-shim 0.2.3 A shim layer for notebook traits and config numpy 1.25.2 Fundamental package for array computing in Python nvidia-cublas-cu11 11.10.3.66 CUBLAS native runtime libraries nvidia-cuda-cupti-cu11 11.7.101 CUDA profiling tools runtime libs. nvidia-cuda-nvrtc-cu11 11.7.99 NVRTC native runtime libraries nvidia-cuda-runtime-cu11 11.7.99 CUDA Runtime native Libraries nvidia-cudnn-cu11 8.5.0.96 cuDNN runtime libraries nvidia-cufft-cu11 10.9.0.58 CUFFT native runtime libraries nvidia-curand-cu11 10.2.10.91 CURAND native runtime libraries nvidia-cusolver-cu11 11.4.0.1 CUDA solver native runtime libraries nvidia-cusparse-cu11 11.7.4.91 CUSPARSE native runtime libraries nvidia-nccl-cu11 2.14.3 NVIDIA Collective Communication Library (NCCL) Runtime nvidia-nvtx-cu11 11.7.91 NVIDIA Tools Extension openai 0.28.1 Python client library for the OpenAI API overrides 7.4.0 A decorator to automatically detect mismatch when overriding a method. 
packaging 23.2 Core utilities for Python packages pandas 2.1.1 Powerful data structures for data analysis, time series, and statistics pandocfilters 1.5.0 Utilities for writing pandoc filters in python parso 0.8.3 A Python Parser pathspec 0.11.2 Utility library for gitignore style pattern matching of file paths. pexpect 4.8.0 Pexpect allows easy control of interactive console applications. pickleshare 0.7.5 Tiny 'shelve'-like database with concurrency support pillow 10.0.1 Python Imaging Library (Fork) platformdirs 3.11.0 A small Python package for determining appropriate platform-specific dirs, e.g. a "user data dir". prometheus-client 0.17.1 Python client for the Prometheus monitoring system. prompt-toolkit 3.0.39 Library for building powerful interactive command lines in Python protobuf 4.24.4 psutil 5.9.5 Cross-platform lib for process and system monitoring in Python. psycopg2 2.9.9 psycopg2 - Python-PostgreSQL Database Adapter ptyprocess 0.7.0 Run a subprocess in a pseudo terminal pure-eval 0.2.2 Safely evaluate AST nodes without side effects pyarrow 13.0.0 Python library for Apache Arrow pycparser 2.21 C parser in Python pydantic 1.10.13 Data validation and settings management using python type hints pyflakes 3.1.0 passive checker of Python programs pygments 2.16.1 Pygments is a syntax highlighting package written in Python. pylint 2.17.7 python code static checker pymdown-extensions 10.3 Extension pack for Python Markdown. pymupdf 1.23.4 A high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents. pymupdfb 1.23.3 MuPDF shared libraries for PyMuPDF. pyperclip 1.8.2 A cross-platform clipboard module for Python. (Only handles plain text for now.) 
python-dateutil 2.8.2 Extensions to the standard Python datetime module python-dotenv 1.0.0 Read key-value pairs from a .env file and set them as environment variables python-json-logger 2.0.7 A python library adding a json log formatter pytz 2023.3.post1 World timezone definitions, modern and historical pyyaml 6.0.1 YAML parser and emitter for Python pyzmq 25.1.1 Python bindings for 0MQ qtconsole 5.4.4 Jupyter Qt console qtpy 2.4.0 Provides an abstraction layer on top of the various Qt bindings (PyQt5/6 and PySide2/6). ray 2.7.0 Ray provides a simple, universal API for building distributed applications. referencing 0.30.2 JSON Referencing + Python regex 2023.10.3 Alternative regular expression module, to replace re. requests 2.31.0 Python HTTP for Humans. rfc3339-validator 0.1.4 A pure python RFC3339 validator rfc3986-validator 0.1.1 Pure python rfc3986 validator rpds-py 0.10.4 Python bindings to Rust's persistent data structures (rpds) safetensors 0.4.0 scikit-learn 1.3.1 A set of python modules for machine learning and data mining scipy 1.9.3 Fundamental algorithms for scientific computing in Python send2trash 1.8.2 Send file to trash natively under Mac OS X, Windows and Linux sentence-transformers 2.2.2 Multilingual text embeddings sentencepiece 0.1.99 SentencePiece python wrapper setuptools 68.2.2 Easily download, build, install, upgrade, and uninstall Python packages six 1.16.0 Python 2 and 3 compatibility utilities sniffio 1.3.0 Sniff out which async library your code is running under soupsieve 2.5 A modern CSS selector implementation for Beautiful Soup. sqlalchemy 1.4.41 Database Abstraction Library sqlalchemy2-stubs 0.0.2a35 Typing Stubs for SQLAlchemy 1.4 sqlmodel 0.0.8 SQLModel, SQL databases in Python, designed for simplicity, compatibility, and robustness. 
stack-data 0.6.3 Extract data from python stack frames and tracebacks for informative displays stopit 1.1.2 Timeout control decorator and context managers, raise any exception in another thread sympy 1.12 Computer algebra system (CAS) in Python tenacity 8.2.3 Retry code until it succeeds terminado 0.17.1 Tornado websocket backend for the Xterm.js Javascript terminal emulator library. threadpoolctl 3.2.0 threadpoolctl tiktoken 0.5.1 tiktoken is a fast BPE tokeniser for use with OpenAI's models tinycss2 1.2.1 A tiny CSS parser tokenizers 0.14.1 tomli 2.0.1 A lil' TOML parser tomlkit 0.12.1 Style preserving TOML library torch 2.0.0 Tensors and Dynamic neural networks in Python with strong GPU acceleration torchvision 0.15.1 image and video datasets and models for torch deep learning tornado 6.3.3 Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed. tqdm 4.66.1 Fast, Extensible Progress Meter traitlets 5.11.2 Traitlets Python configuration system transformers 4.34.0 State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow triton 2.0.0 A language and compiler for custom Deep Learning operations types-python-dateutil 2.8.19.14 Typing stubs for python-dateutil typing-extensions 4.8.0 Backported and Experimental Type Hints for Python 3.8+ tzdata 2023.3 Provider of IANA time zone data uri-template 1.3.0 RFC 6570 URI Template Processor urllib3 2.0.6 HTTP library with thread-safe connection pooling, file post, and more. wcwidth 0.2.8 Measures the displayed width of unicode strings in a terminal webcolors 1.13 A library for working with the color formats defined by HTML and CSS. webencodings 0.5.1 Character encoding aliases for legacy web content websocket-client 1.6.4 WebSocket client for Python with low level API options wheel 0.41.2 A built-package format for Python widgetsnbextension 4.0.9 Jupyter interactive widgets for Jupyter Notebook wrapt 1.15.0 Module for decorators, wrappers and monkey patching. 
xxhash 3.4.1 Python binding for xxHash yarl 1.9.2 Yet another URL library

VikParuchuri commented 1 year ago

Hi, this appears to be an issue where the current user doesn't have passwordless access to Postgres. Your options are listed here. For the last option (connection string), you can change the connection string either in the local.env file or by setting an env var, as described in the README.
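To check whether a given connection string carries credentials at all, a small helper like the one below may be useful. It is a hypothetical sketch and not part of textbook_quality. It assumes the connection string is exposed through an environment variable named DATABASE_URL (the name used later in this thread) and only inspects the URL; it never contacts the server. As far as I know, asyncpg falls back to the operating-system user name when the URL names no user, which would be consistent with the error mentioning "tluo".

```python
import os
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict:
    """Report which parts of a PostgreSQL connection URL are set."""
    parts = urlsplit(url)
    return {
        "user": parts.username,                      # None if the URL names no user
        "has_password": parts.password is not None,  # never echo the password itself
        "host": parts.hostname,
        "database": parts.path.lstrip("/") or None,
    }


if __name__ == "__main__":
    # DATABASE_URL is an assumed variable name; fall back to an empty string.
    url = os.environ.get("DATABASE_URL", "")
    print(describe_database_url(url))
```

If "user" is None or "has_password" is False, the server has to accept the connection without a password, which is the situation described above.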

luotongml commented 1 year ago

Hi Vik, the solution works. Thanks for the timely help.

To clarify for others, I modified the local.env file by adding this line:

DATABASE_URL="postgresql://{user}:{passwd}@localhost/textbook"

where {user} is your PostgreSQL user name and {passwd} is that user's PostgreSQL password.
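One caveat with this format: a password containing characters such as "@", ":" or "/" can break URL parsing unless it is percent-encoded. The following is a minimal sketch of building the value safely. The function name build_database_url and its defaults are assumptions for illustration, not something the project provides.

```python
from urllib.parse import quote


def build_database_url(user: str, password: str,
                       host: str = "localhost",
                       database: str = "textbook") -> str:
    """Assemble a PostgreSQL URL, percent-encoding the credentials."""
    # safe="" so that characters such as "@", ":" and "/" are escaped too.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}/{database}"
    )


# Example with a made-up password containing reserved characters.
print(build_database_url("tluo", "p@ss/word"))
# prints postgresql://tluo:p%40ss%2Fword@localhost/textbook
```

The printed string can then be used as the DATABASE_URL value in local.env. Whether the encoded form is decoded correctly depends on the driver in use, so it is worth testing the connection after changing it.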