arc53 / DocsGPT

A chatbot for your documentation that lets you chat with your data. Privately deployable, it provides AI knowledge sharing and integrates knowledge into your AI workflow.
https://app.docsgpt.cloud/
MIT License

Training in progress... #436

Closed: jamsnrihk closed this issue 3 months ago

jamsnrihk commented 1 year ago

I cloned the latest version and tried to upload a PDF file, but the system hangs at "Training in progress...". The old version (from 4 days ago) doesn't have this problem when I upload the same PDF file. My .env file looks like this:

API_KEY="my OpenAI API key"
EMBEDDINGS_KEY="my OpenAI API key"
API_URL=localhost:7091
FLASK_APP=application/app.py
FLASK_DEBUG=true

For OpenAI on Azure:

OPENAI_API_BASE=

OPENAI_API_VERSION=

AZURE_DEPLOYMENT_NAME=

AZURE_EMBEDDINGS_DEPLOYMENT_NAME=

(screenshot attachment: cap3)
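As a side note, here is a minimal sketch (not part of DocsGPT itself, and assuming python-dotenv is installed) to confirm that the values in the .env file above are actually picked up before starting the backend:

import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env from the current working directory

# Report which of the keys listed above are set; names follow the .env example.
for key in ("API_KEY", "EMBEDDINGS_KEY", "API_URL", "FLASK_APP", "FLASK_DEBUG"):
    print(f"{key}: {'set' if os.getenv(key) else 'MISSING'}")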

dartpain commented 1 year ago

Hmm, I can't replicate this. Can you give more details, please? Are you launching with docker-compose?

Does it answer questions on the default dataset, or are there only issues when you try uploading? Are you running it on your local device?

Xiangyang-Foxtel commented 1 year ago

@dartpain I got the same issue; here are the container logs:

 [2023-10-07 09:00:43,124: ERROR/MainProcess] Process 'ForkPoolWorker-2' pid:20 exited with 'signal 9 (SIGKILL)'
docsgpt-worker-1    | [2023-10-07 09:00:43,145: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 9 (SIGKILL) Job: 0.')
docsgpt-worker-1    | Traceback (most recent call last):
docsgpt-worker-1    |   File "/usr/local/lib/python3.10/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost
docsgpt-worker-1    |     raise WorkerLostError(
docsgpt-worker-1    | billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL) Job: 0.
docsgpt-backend-1   | [2023-10-07 09:00:43 +0000] [7] [ERROR] Error handling request /api/task_status?task_id=97ea9648-a242-417c-b5e2-dd032986e3cd
docsgpt-backend-1   | Traceback (most recent call last):
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/gunicorn/workers/sync.py", line 136, in handle
docsgpt-backend-1   |     self.handle_request(listener, req, client, addr)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/gunicorn/workers/sync.py", line 179, in handle_request
docsgpt-backend-1   |     respiter = self.wsgi(environ, resp.start_response)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 2552, in __call__
docsgpt-backend-1   |     return self.wsgi_app(environ, start_response)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 2532, in wsgi_app
docsgpt-backend-1   |     response = self.handle_exception(e)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 2529, in wsgi_app
docsgpt-backend-1   |     response = self.full_dispatch_request()
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1826, in full_dispatch_request
docsgpt-backend-1   |     return self.finalize_request(rv)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1845, in finalize_request
docsgpt-backend-1   |     response = self.make_response(rv)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 2157, in make_response
docsgpt-backend-1   |     rv = self.json.response(rv)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/json/provider.py", line 309, in response
docsgpt-backend-1   |     f"{self.dumps(obj, **dump_args)}\n", mimetype=mimetype
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/json/provider.py", line 230, in dumps
docsgpt-backend-1   |     return json.dumps(obj, **kwargs)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/json/__init__.py", line 238, in dumps
docsgpt-backend-1   |     **kw).encode(obj)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/json/encoder.py", line 201, in encode
docsgpt-backend-1   |     chunks = list(chunks)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/json/encoder.py", line 431, in _iterencode
docsgpt-backend-1   |     yield from _iterencode_dict(o, _current_indent_level)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
docsgpt-backend-1   |     yield from chunks
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/json/encoder.py", line 438, in _iterencode
docsgpt-backend-1   |     o = _default(o)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/json/provider.py", line 122, in _default
docsgpt-backend-1   |     raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")
docsgpt-backend-1   | TypeError: Object of type WorkerLostError is not JSON serializable
docsgpt-backend-1   | [2023-10-07 09:00:43 +0000] [7] [ERROR] Error handling request /api/task_status?task_id=97ea9648-a242-417c-b5e2-dd032986e3cd
docsgpt-backend-1   | Traceback (most recent call last):
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/gunicorn/workers/sync.py", line 136, in handle
docsgpt-backend-1   |     self.handle_request(listener, req, client, addr)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/gunicorn/workers/sync.py", line 179, in handle_request
docsgpt-backend-1   |     respiter = self.wsgi(environ, resp.start_response)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 2552, in __call__
docsgpt-backend-1   |     return self.wsgi_app(environ, start_response)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 2532, in wsgi_app
docsgpt-backend-1   |     response = self.handle_exception(e)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 2529, in wsgi_app
docsgpt-backend-1   |     response = self.full_dispatch_request()
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1826, in full_dispatch_request
docsgpt-backend-1   |     return self.finalize_request(rv)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1845, in finalize_request
docsgpt-backend-1   |     response = self.make_response(rv)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 2157, in make_response
docsgpt-backend-1   |     rv = self.json.response(rv)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/json/provider.py", line 309, in response
docsgpt-backend-1   |     f"{self.dumps(obj, **dump_args)}\n", mimetype=mimetype
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/json/provider.py", line 230, in dumps
docsgpt-backend-1   |     return json.dumps(obj, **kwargs)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/json/__init__.py", line 238, in dumps
docsgpt-backend-1   |     **kw).encode(obj)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/json/encoder.py", line 201, in encode
docsgpt-backend-1   |     chunks = list(chunks)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/json/encoder.py", line 431, in _iterencode
docsgpt-backend-1   |     yield from _iterencode_dict(o, _current_indent_level)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
docsgpt-backend-1   |     yield from chunks
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/json/encoder.py", line 438, in _iterencode
docsgpt-backend-1   |     o = _default(o)
docsgpt-backend-1   |   File "/usr/local/lib/python3.10/site-packages/flask/json/provider.py", line 122, in _default
docsgpt-backend-1   |     raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")
docsgpt-backend-1   | TypeError: Object of type WorkerLostError is not JSON serializable
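For context, the backend traceback shows the /api/task_status handler passing the Celery task's result straight to Flask's JSON encoder; when the worker dies, that result is a WorkerLostError instance, which json.dumps cannot serialize. A minimal sketch of a guard (an illustration only, not DocsGPT's actual handler; the route and field names are assumptions):

from celery.result import AsyncResult
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/api/task_status")
def task_status():
    # Look up the Celery task by id (passed as ?task_id=...).
    task = AsyncResult(request.args.get("task_id"))
    result = task.result
    # A failed task stores the raised exception (e.g. WorkerLostError) as its
    # result; stringify it so jsonify can serialize the response.
    if isinstance(result, Exception):
        result = str(result)
    return jsonify({"status": task.status, "result": result})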
dartpain commented 1 year ago

Need to investigate why worker exited.

Possible causes:

- Memory issues: check whether the system is running out of memory; monitor memory usage while running the application to confirm this (see the sketch below).
- Manually killed: make sure the process is not being killed by some other process or script.
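Signal 9 inside a container usually comes from the kernel or Docker OOM killer, so memory is the first thing to rule out. A minimal sketch (assuming psutil is available; not part of DocsGPT) that prints memory usage once per second while you reproduce the upload:

import time

import psutil  # pip install psutil

# Watch system memory while the ingest task runs; a steady climb followed by
# the worker dying points to an out-of-memory kill.
while True:
    mem = psutil.virtual_memory()
    print(f"used={mem.used / 1e6:.0f} MB  available={mem.available / 1e6:.0f} MB")
    time.sleep(1)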

Are there any more logs for docsgpt-worker-1 container?

Xiangyang-Foxtel commented 1 year ago

Hi @dartpain, thanks for your quick response. Here is the full worker log:

OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
/usr/local/lib/python3.10/site-packages/celery/platforms.py:840: SecurityWarning: You're running the worker with superuser privileges: this is
absolutely not recommended!

Please specify a different user using the --uid option.

User information: uid=0 euid=0 gid=0 egid=0

  warnings.warn(SecurityWarning(ROOT_DISCOURAGED.format(

 -------------- celery@8436a71b938a v5.2.7 (dawn-chorus)
--- ***** -----
-- ******* ---- Linux-5.15.82-0-virt-x86_64-with-glibc2.31 2023-10-07 09:00:08
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app:         application.celery:0x7f088ba6b1c0
- ** ---------- .> transport:   redis://redis:6379/0
- ** ---------- .> results:     redis://redis:6379/1
- *** --- * --- .> concurrency: 2 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
 -------------- [queues]
                .> celery           exchange=celery(direct) key=celery

[tasks]
  . application.api.user.tasks.ingest

[2023-10-07 09:00:09,156: INFO/MainProcess] Connected to redis://redis:6379/0
[2023-10-07 09:00:09,175: INFO/MainProcess] mingle: searching for neighbors
[2023-10-07 09:00:10,198: INFO/MainProcess] mingle: all alone
[2023-10-07 09:00:10,311: INFO/MainProcess] celery@8436a71b938a ready.
[2023-10-07 09:00:37,233: INFO/MainProcess] Task application.api.user.tasks.ingest[97ea9648-a242-417c-b5e2-dd032986e3cd] received
[2023-10-07 09:00:37,237: WARNING/ForkPoolWorker-2] inputs/local/E_EG_441_0081.pdf
[2023-10-07 09:00:37,258: WARNING/ForkPoolWorker-2] <Response [200]>
[2023-10-07 09:00:38,421: WARNING/ForkPoolWorker-2] Grouping small documents
[2023-10-07 09:00:43,124: ERROR/MainProcess] Process 'ForkPoolWorker-2' pid:20 exited with 'signal 9 (SIGKILL)'
[2023-10-07 09:00:43,145: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 9 (SIGKILL) Job: 0.')
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost
    raise WorkerLostError(
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL) Job: 0.

worker: Warm shutdown (MainProcess)

Yesterday it happened on the demo website https://docsgpt.arc53.com/ and in my local environment, but I cannot reproduce the issue today; it works after restarting all the containers.

[2023-10-08 01:32:03,828: INFO/MainProcess] Connected to redis://redis:6379/0
[2023-10-08 01:32:03,834: INFO/MainProcess] mingle: searching for neighbors
[2023-10-08 01:32:04,850: INFO/MainProcess] mingle: all alone
[2023-10-08 01:32:04,870: INFO/MainProcess] celery@8436a71b938a ready.
[2023-10-08 01:32:41,744: INFO/MainProcess] Task application.api.user.tasks.ingest[d137c53b-ea4b-4579-8142-79ce83a797bc] received
[2023-10-08 01:32:41,747: WARNING/ForkPoolWorker-2] inputs/local/E_EG_441_0081.pdf
[2023-10-08 01:32:41,773: WARNING/ForkPoolWorker-2] <Response [200]>
[2023-10-08 01:32:42,915: WARNING/ForkPoolWorker-2] Grouping small documents
[2023-10-08 01:32:43,282: WARNING/ForkPoolWorker-2] Separating large documents
[2023-10-08 01:32:44,067: INFO/ForkPoolWorker-2] Loading faiss with AVX2 support.
OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
[2023-10-08 01:32:44,119: INFO/ForkPoolWorker-2] Successfully loaded faiss with AVX2 support.
Embedding 🦖:   0%|          | Time Left: ?
Embedding 🦖:  20%|##        | Time Left: 00:04
Embedding 🦖:  40%|####      | Time Left: 00:02
Embedding 🦖:  60%|######    | Time Left: 00:01
Embedding 🦖:  80%|########  | Time Left: 00:00
Embedding 🦖: 100%|##########| Time Left: 00:00
Embedding 🦖: 100%|##########| Time Left: 00:00
[2023-10-08 01:32:47,854: INFO/ForkPoolWorker-2] Task application.api.user.tasks.ingest[d137c53b-ea4b-4579-8142-79ce83a797bc] succeeded in 6.10717300000033s: {'directory': 'inputs', 'formats': ['.rst', '.md', '.pdf', '.txt'], 'name_job': 'E_EG_441_0081.pdf', 'filename': 'E_EG_441_0081.pdf', 'user': 'local', 'limited': False}
jamsnrihk commented 1 year ago

@dartpain Alex, I think I found the reason: if the uploaded file name includes Chinese characters, the log shows the following errors. With the same file, if I change the file name to English, training completes smoothly. Maybe some encoder needs to be changed to support UTF-8.

2023-10-08 22:48:51 docsgpt-backend-1 |     f"{self.dumps(obj, **dump_args)}\n", mimetype=mimetype
2023-10-08 22:48:51 docsgpt-backend-1 |   File "/usr/local/lib/python3.10/site-packages/flask/json/provider.py", line 230, in dumps
2023-10-08 22:48:51 docsgpt-backend-1 |     return json.dumps(obj, **kwargs)
2023-10-08 22:48:51 docsgpt-backend-1 |   File "/usr/local/lib/python3.10/json/__init__.py", line 238, in dumps
2023-10-08 22:48:51 docsgpt-backend-1 |     **kw).encode(obj)
2023-10-08 22:48:51 docsgpt-backend-1 |   File "/usr/local/lib/python3.10/json/encoder.py", line 201, in encode
2023-10-08 22:48:51 docsgpt-backend-1 |     chunks = list(chunks)
2023-10-08 22:48:51 docsgpt-backend-1 |   File "/usr/local/lib/python3.10/json/encoder.py", line 431, in _iterencode
2023-10-08 22:48:51 docsgpt-backend-1 |     yield from _iterencode_dict(o, _current_indent_level)
2023-10-08 22:48:51 docsgpt-backend-1 |   File "/usr/local/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
2023-10-08 22:48:51 docsgpt-backend-1 |     yield from chunks
2023-10-08 22:48:51 docsgpt-backend-1 |   File "/usr/local/lib/python3.10/json/encoder.py", line 438, in _iterencode
2023-10-08 22:48:51 docsgpt-backend-1 |     o = _default(o)
2023-10-08 22:48:51 docsgpt-backend-1 |   File "/usr/local/lib/python3.10/site-packages/flask/json/provider.py", line 122, in _default
2023-10-08 22:48:51 docsgpt-backend-1 |     raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")
2023-10-08 22:48:51 docsgpt-backend-1 | TypeError: Object of type IndexError is not JSON serializable
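One possible workaround, sketched below (an illustration only, not DocsGPT's actual upload code): normalize the uploaded filename to an ASCII-safe name before ingestion and keep the original only for display, so non-Latin (e.g. Chinese) filenames cannot break the pipeline.

import unicodedata
import uuid
from pathlib import Path

def safe_filename(original_name: str) -> str:
    """Return an ASCII-only filename, falling back to a random stem."""
    stem, suffix = Path(original_name).stem, Path(original_name).suffix
    ascii_stem = (
        unicodedata.normalize("NFKD", stem)
        .encode("ascii", "ignore")
        .decode("ascii")
        .strip()
    )
    if not ascii_stem:                 # e.g. a fully Chinese name
        ascii_stem = uuid.uuid4().hex  # unique but ASCII-safe
    return f"{ascii_stem}{suffix}"

print(safe_filename("报告.pdf"))   # -> e.g. '3f9c...d21.pdf'
print(safe_filename("axis.pdf"))  # -> 'axis.pdf'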

asakura42 commented 1 year ago

Same error on the official instance. The file name is fully Latin (axis.pdf).

dartpain commented 1 year ago

OK, thank you for reporting it; it's definitely some encoding issue. Can you please send me a link to the document, either here or via email to alex@arc53.com?

LoveYourEnemy commented 1 year ago

I encountered the same issue using https://docsgpt.arc53.com/. The PDF names are fully Latin. I tried 2 PDFs; neither worked.

dartpain commented 1 year ago

@LoveYourEnemy Please send me the PDFs; I would appreciate it a lot!

jamsnrihk commented 1 year ago

> OK, thank you for reporting it; it's definitely some encoding issue. Can you please send me a link to the document, either here or via email to alex@arc53.com?

Files sent. Please kindly check and help.

dartpain commented 1 year ago

Yep, I got them, thank you. I will try to fix it a bit later today. Once there is a fix, I'll update you.

Thank you!

jamsnrihk commented 1 year ago

Dear Alex, any update on this issue?

Best regards,
James


dartpain commented 1 year ago

@pabik tried replicating the bug, but unfortunately without success. Can you try again, please?

LoveYourEnemy commented 1 year ago

@jamsnrihk I tried using a different browser in which I disabled all ad blockers and allowed all scripts. The website successfully trained the PDF I uploaded and also provided a summary.

jamsnrihk commented 1 year ago

If the file name is in English, it uploads and trains without problems, but if you rename the file to Chinese or another double-byte language, it will NOT upload and train.


dartpain commented 1 year ago

@jamsnrihk Thank you for the note, I will try to replicate it now.

123HAOBO commented 12 months ago

I visited the website docsgpt.arc53.com/ provided by the project, but training got stuck at 100% and the interface stopped responding. Could anyone explain why? (screenshot attachment: 屏幕截图(2))

LoveYourEnemy commented 10 months ago

I was able to train a fully Latin-named PDF 3 or 4 days ago, but all of a sudden uploads get stuck at 0% (using the website). When I refresh the web page after it gets stuck at 0%, I get the error message: 404: NOT_FOUND Code: NOT_FOUND ID: fra1::prw7c-1702454727390-1fd3bbf5fdf1

LoveYourEnemy commented 10 months ago

https://github.com/arc53/DocsGPT/issues/490#issuecomment-1751882587 I think this might also be the problem on the website

dartpain commented 9 months ago

It should be resolved now, please try again. There was a bit of an overflow.

huicewang commented 7 months ago

I encountered the same issue in a Windows dev environment, but there are no error messages in any of my logs.

dartpain commented 7 months ago

Can you please provide me the file that you are trying to upload? Does it work on the demo? https://docsgpt.arc53.com/

huicewang commented 7 months ago

Yes, it works well on the demo, and there are no error messages in the logs. (Attachment: testcase.pdf)

dartpain commented 7 months ago

The demo ingests files the same way the open-source version does. Weird that it's not working for you. Does the default file work?

huicewang commented 7 months ago

Yes, chat is OK.

dartpain commented 7 months ago

Is this summary relevant? (screenshot attachment: CleanShot 2024-03-07 at 11 29 54@2x)

I used the latest version.

Please walk me through your deployment and any logs that you see in the backend and worker instances.

Do you have a new source doc once training is finished?

huicewang commented 7 months ago

(screenshot attachment: 1709963265985)

dartpain commented 3 months ago

This should all be fixed in newer versions.