Open jonwolds opened 1 year ago
Hi @jonwolds , the redis issue usually happens when the path where the redis config file is created does not exist inside the docker container. It's fixed by creating this path by hand.
Many thanks for that. I'm still getting errors in dp-front and dp-back: "gunicorn.errors.HaltServer: <HaltServer 'Worker failed to boot.' 3>".
Also, the Cannot write to /var/solr as 8983:8983 persists. I could edit the permissions of /var/solr by exec-ing in, but the container stops as soon as it starts, so I'd have to create a new container, I suppose, but then I'm not sure about how to link that back up with docker-compose. Any ideas?
Answering my own question, this worked: (sudo) chown 8983:8983 /mnt/solr-data/solr
I hadn’t realised what the dp-solr container was when I first asked the question.
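For anyone hitting the same "Cannot write to /var/solr as 8983:8983" message, a quick ownership check on the host directory can be sketched as below. This is only an illustration: the owned_by helper is a made-up name, and the path is the one from the chown command above, which may differ on another machine.

```python
import os


def owned_by(path, uid, gid):
    """Return True if `path` is owned by the given numeric uid and gid."""
    info = os.stat(path)
    return info.st_uid == uid and info.st_gid == gid


if __name__ == "__main__":
    # Assumption: this is the host directory mounted into dp-solr,
    # taken from the chown command above.
    solr_dir = "/mnt/solr-data/solr"
    if os.path.isdir(solr_dir):
        print(solr_dir, "owned by 8983:8983:", owned_by(solr_dir, 8983, 8983))
    else:
        print(solr_dir, "does not exist on this machine")
```

If the check prints False, the chown above is the fix that worked in this thread.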
Awesome! Is everything already working for you?
localhost:5000 page fires up fine, but I guess I need to do a lot of configuration work.
Basically, I just want to load up a couple of the manufactured corpora from paracrawl.eu for my own personal use, so I’ve no need for the authentication system, and I don’t think I’m able to install the Google app because I don’t have access to Google Workspace.
Any tips on the best order to do things in would be very welcome.
Many thanks in advance!
I'm still struggling to get the setup working. The login process appears to work fine, but I then get sent to localhost:5000/search where I get a 500 - Internal Server Error. The log produced (from dp-front) is below:
[2023-01-26 19:25:53 +0000] [15] [INFO] Booting worker with pid: 15
[2023-01-26 19:26:11,728] ERROR in app: Exception on /search/ [GET]
Traceback (most recent call last):
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
response = self.full_dispatch_request()
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
raise value
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1950, in full_dispatch_request
rv = self.dispatch_request()
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1936, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask_login/utils.py", line 272, in decorated_view
return func(*args, **kwargs)
File "/opt/dp/front/app/blueprints/search/views.py", line 54, in search_view
corpus_collection = base_corpus.solr_collection
AttributeError: 'NoneType' object has no attribute 'solr_collection'
I guess that's related to the set-up of the solr collection, which is where I'm struggling to follow the deployment instructions. I copied the solr.xml to the directory referenced by docker-compose.yaml and changed the permissions, but the dp-solr log still says:
2023-01-26 19:25:56.480 INFO (main) [] o.a.s.s.CoreContainerProvider Solr Home: /var/solr/data (source: system property: solr.solr.home)
2023-01-26 19:25:56.483 INFO (main) [] o.a.s.c.SolrXmlConfig solr.xml not found in SOLR_HOME, using built-in default
I put a core.properties file there, too, but it references a schema.xml and a solrconfig.xml, which I do not know how to set up (no instructions in the deployment guide).
Also, the deployment section (I may be jumping the gun here) says "Go to the web interface of your Solr instance", but localhost:5000 is the only port open, so I don't really understand what this means.
Any ideas? Many thanks in advance,
Jon
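Since the log reports "solr.xml not found in SOLR_HOME", it may help to list which of the files Solr looks for are actually present. The sketch below is a rough aid, not part of corset: missing_solr_files and the core name "mycore" are hypothetical, and it has to be pointed at whatever directory really ends up as /var/solr/data inside the container (on the host, that is the mounted directory).

```python
from pathlib import Path


def missing_solr_files(solr_home, core_name):
    """List the files Solr looks for that are absent under `solr_home`."""
    home = Path(solr_home)
    core = home / core_name
    expected = [
        home / "solr.xml",
        core / "core.properties",
        core / "conf" / "solrconfig.xml",
    ]
    missing = [str(p) for p in expected if not p.is_file()]
    # A classic schema.xml or a managed schema file is accepted,
    # depending on the schema factory configured in solrconfig.xml.
    schema_names = ("schema.xml", "managed-schema", "managed-schema.xml")
    if not any((core / "conf" / name).is_file() for name in schema_names):
        missing.append(str(core / "conf" / "schema.xml"))
    return missing


if __name__ == "__main__":
    # Assumption: run inside the container, or swap in the host-side mount path.
    for path in missing_solr_files("/var/solr/data", "mycore"):
        print("missing:", path)
```

An empty result only means the files exist; ownership (8983:8983) still has to be right for Solr to read and write them.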
Hi again Jon!
I've been taking a look into the Dockerfiles and, according to https://github.com/paracrawl/corset/blob/master/docker-compose.yaml#L53 , I think the Solr web interface should be reachable at localhost:8090 (or maybe localhost:8090/solr).
Regarding the missing schema.xml: it's in the root folder of the repository (and also here). As for the solrconfig.xml, I am not 100% confident, but I think it's self-generated by solr.
Thanks for following up, Marta. It's much appreciated!
Here's my progress so far (I won't be working on this for the next week).
I got into the solr web interface by adding
ports:
In the end, I created the new core using: ./solr create -c name-of-your-new-core
I had to exec in to the dp-solr container to do this, as my efforts via the web interface were not successful in creating a solrconfig.xml file. This new core then shows up in the solr web interface correctly. It probably needs adjusting using the schema.xml file from the corset directory, too. Changing the permissions (chown 8983:8983) is always necessary, too.
I'm still getting an internal server error at localhost:5000/search, but hopefully once I load some data into the core I've created things might improve.
This is the error message I'm getting from dp-front
[2023-02-05 11:40:42,481] ERROR in app: Exception on /search/ [GET]
Traceback (most recent call last):
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
response = self.full_dispatch_request()
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
raise value
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1950, in full_dispatch_request
rv = self.dispatch_request()
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1936, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask_login/utils.py", line 272, in decorated_view
return func(*args, **kwargs)
File "/opt/dp/front/app/blueprints/search/views.py", line 54, in search_view
corpus_collection = base_corpus.solr_collection
AttributeError: 'NoneType' object has no attribute 'solr_collection'
I've tried to upload some data using tmxutils, but that hasn't been successful yet
Hi again! The error suggests that no corpora are registered in the DB (which makes sense because you are having trouble with that :)). What error are you getting when trying to upload data?
Hi again, I've been having a look at this again, and I've now managed to upload data into solr, but I still can't manage to sort out the link between solr and dp-front.
My configuration is most likely wrong, but the information provided is not quite enough to get it working.
The specific error I'm getting in the gunicorn-error.log is:
[2023-03-09 18:37:23 +0000] [13] [INFO] Booting worker with pid: 13
[2023-03-09 18:37:37,216] ERROR in app: Exception on /search/ [GET]
Traceback (most recent call last):
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
response = self.full_dispatch_request()
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
raise value
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1950, in full_dispatch_request
rv = self.dispatch_request()
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1936, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask_login/utils.py", line 272, in decorated_view
return func(*args, **kwargs)
File "/opt/dp/front/app/blueprints/search/views.py", line 54, in search_view
corpus_collection = base_corpus.solr_collection
AttributeError: 'NoneType' object has no attribute 'solr_collection'
http://localhost:5000/search/ produces a 500 Internal Server Error.
Any ideas?
Cheers, Jon
I got to the next stage and finally managed to get the /search/ page to appear properly by using an INSERT SQL command tailored to the solr collection I had created based on the model in the greyed-out part of the dpdb_initdb.sql file.
Unfortunately, the search function still doesn't find anything. I'm guessing there's more configuration to do with the dpdb tables in postgres.
This is the error message in the gunicorn-error.log (dp-front)
[2023-03-12 17:34:53,188] ERROR in app: Exception on /query/ [GET]
Traceback (most recent call last):
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
response = self.full_dispatch_request()
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
raise value
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1950, in full_dispatch_request
rv = self.dispatch_request()
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask/app.py", line 1936, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/opt/dp/front/venv/lib/python3.9/site-packages/flask_login/utils.py", line 272, in decorated_view
return func(*args, **kwargs)
File "/opt/dp/front/app/blueprints/query/views.py", line 32, in query_view
base_corpus = base_corpus_bo.get_base_corpora_by_pair(source_lang.code, target_langs[0].code)[0]
IndexError: list index out of range
Hi Jon, all your errors seem related to the fact that "get_base_corpora_by_pair" is not returning anything. This is probably caused by the DB being empty (or not properly filled by the INSERT you made by hand), or the connection between the front, the back and the DB does not work.
Some hints:
Hi again Marta,
This is what I have in the basecorpora table:
"id" "name" "description" "source_lang" "target_lang" "sentences" "size_mb" "solr_collection" "is_active" "is_highlight"
1 "TMXcore FR-EN" "French English tmx" 12 1 22093 20 "tmxcore" true true
Can you see anything obviously wrong? tmxcore is the name of the solr core.
Thanks again for your help!
Jon
I can see that the search terms (e.g. charter here) are making it from dp-front to dp-solr, but no hits are displayed. This is the log from dp-solr:
2023-03-14 19:52:12.238 INFO (qtp1622458036-22) [ x:tmxcore] o.a.s.c.S.Request webapp=/solr path=/select params={q=trg:"charter"&hl=true&start=0&hl.fragsize=0&hl.fl=trg&sort=custom_score+desc&rows=50&wt=json} hits=38 status=0 QTime=11
2023-03-14 19:52:38.274 INFO (qtp1622458036-25) [ x:tmxcore] o.a.s.c.S.Request webapp=/solr path=/select params={q=src:"charter"&hl=true&start=0&hl.fragsize=0&hl.fl=src&sort=custom_score+desc&rows=50&wt=json} hits=1 status=0 QTime=1
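To check what Solr itself returns for these requests, the same query can be rebuilt and opened in a browser or with curl. A minimal sketch follows, assuming Solr is published on localhost:8090 as guessed earlier in the thread. The solr_select_url helper is hypothetical; the parameters are copied from the log lines above.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, core, field, term):
    """Build a /select URL with the same parameters seen in the dp-solr log."""
    params = {
        "q": f'{field}:"{term}"',
        "hl": "true",
        "start": 0,
        "hl.fragsize": 0,
        "hl.fl": field,
        "sort": "custom_score desc",
        "rows": 50,
        "wt": "json",
    }
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Assumption: host and port may differ depending on the ports mapping in use.
    print(solr_select_url("http://localhost:8090", "tmxcore", "trg", "charter"))
```

Comparing the JSON from this URL with what dp-back tries to serialize helps narrow down where the hits get lost.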
OK, I think by reversing the order of the languages, so English is first (source) and French the second (target), that initial error is averted. However, the following error is now showing up in the gunicorn-api-error.log in dp-back:
[2023-03-15 21:00:55,678] ERROR in app: Exception on /search [GET]
Traceback (most recent call last):
File "/opt/dp/back/venv/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
rv = self.dispatch_request()
File "/opt/dp/back/venv/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/opt/dp/back/venv/lib/python3.7/site-packages/flask_restx/api.py", line 375, in wrapper
resp = resource(*args, **kwargs)
File "/opt/dp/back/venv/lib/python3.7/site-packages/flask/views.py", line 89, in view
return self.dispatch_request(*args, **kwargs)
File "/opt/dp/back/venv/lib/python3.7/site-packages/flask_restx/resource.py", line 44, in dispatch_request
resp = meth(*args, **kwargs)
File "/opt/dp/back/venv/lib/python3.7/site-packages/flask_login/utils.py", line 272, in decorated_view
return func(*args, **kwargs)
File "/opt/dp/back/api/resources/search.py", line 55, in get
return SearchResponse.schema().dump(search_response), 200
File "/opt/dp/back/venv/lib/python3.7/site-packages/dataclasses_json/mm.py", line 343, in dump
dumped = Schema.dump(self, obj, many=many)
File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/schema.py", line 558, in dump
result = self._serialize(processed_obj, many=many)
File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/schema.py", line 523, in _serialize
value = field_obj.serialize(attr_name, obj, accessor=self.get_attribute)
File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/fields.py", line 328, in serialize
return self._serialize(value, attr, obj, **kwargs)
File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/fields.py", line 716, in _serialize
return [self.inner._serialize(each, attr, obj, **kwargs) for each in value]
File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/fields.py", line 716, in <listcomp>
return [self.inner._serialize(each, attr, obj, **kwargs) for each in value]
File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/fields.py", line 583, in _serialize
return schema.dump(nested_obj, many=many)
File "/opt/dp/back/venv/lib/python3.7/site-packages/dataclasses_json/mm.py", line 343, in dump
dumped = Schema.dump(self, obj, many=many)
File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/schema.py", line 558, in dump
result = self._serialize(processed_obj, many=many)
File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/schema.py", line 523, in _serialize
value = field_obj.serialize(attr_name, obj, accessor=self.get_attribute)
File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/fields.py", line 328, in serialize
return self._serialize(value, attr, obj, **kwargs)
File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/fields.py", line 916, in _serialize
ret = self._format_num(value) # type: _T
File "/opt/dp/back/venv/lib/python3.7/site-packages/marshmallow/fields.py", line 891, in _format_num
return self.num_type(value)
TypeError: float() argument must be a string or a number, not 'list'
Hi! Yes, I think that having English first is mandatory.
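To make the ordering point concrete, here is a small stand-in for a lookup keyed on the (source, target) pair. It is not the corset code: the REGISTERED dictionary, the first_base_corpus helper and the language codes are hypothetical, and only the name get_base_corpora_by_pair and the IndexError come from the traceback above.

```python
# Hypothetical registry: one corpus registered with English as source, French as target.
REGISTERED = {("en", "fr"): [{"name": "TMXcore", "solr_collection": "tmxcore"}]}


def get_base_corpora_by_pair(source_code, target_code):
    """Return the corpora registered for exactly this (source, target) order."""
    return REGISTERED.get((source_code, target_code), [])


def first_base_corpus(source_code, target_code):
    """Return the first match, or None instead of raising IndexError."""
    matches = get_base_corpora_by_pair(source_code, target_code)
    return matches[0] if matches else None


if __name__ == "__main__":
    print(first_base_corpus("en", "fr"))  # the registered corpus
    print(first_base_corpus("fr", "en"))  # nothing stored under the reversed pair
    try:
        get_base_corpora_by_pair("fr", "en")[0]  # indexing the empty result
    except IndexError as exc:
        print("IndexError:", exc)
```

With a lookup like this, a row stored as French to English is never found by an English to French query, and indexing the empty result with [0] produces the "list index out of range" seen on /query/.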
As for the last error, I had never seen that. I see that, in the error, "flask_login" is mentioned. As mentioned above, you were not using google login. How are you managing authorization and users? Login is needed in search requests (https://github.com/paracrawl/corset/blob/master/back/api/resources/search.py#L18)
Yes, it's a weird error.
I don't think it's linked to login, because that is now working perfectly. My earlier comment was incorrect!
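The final TypeError in the dp-back traceback can be reproduced without Flask or marshmallow: it is what float() raises when handed a list. The sketch below only demonstrates that mechanism. Which field arrives as a list is not established in this thread. One possibility worth checking (an assumption, not a confirmed diagnosis) is a numeric Solr field such as custom_score being defined as multi-valued in the new core, in which case Solr returns it as a list. The unwrap_single helper and the sample document are hypothetical.

```python
def unwrap_single(value):
    """Return the sole element of a one-item list, otherwise the value unchanged."""
    if isinstance(value, list) and len(value) == 1:
        return value[0]
    return value


if __name__ == "__main__":
    doc = {"custom_score": [0.75]}  # hypothetical document with a list-valued score
    try:
        float(doc["custom_score"])
    except TypeError as exc:
        print("TypeError:", exc)
    print(float(unwrap_single(doc["custom_score"])))
```

If this turns out to be the cause, defining the field as single-valued in the core's schema (for example via the schema.xml from the corset repository) is probably cleaner than unwrapping values in the application.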
Hi,
I created the docker containers using docker-compose.yaml and got many of the same issues as stated in the now closed issue. There are errors in the dp-front and dp-back logs as well (gunicorn-error.log isn't writable, redis.log -> can't open the log file: No such file or directory). The dp-solr container doesn't actually seem to start up though.
Any help much appreciated.
Jon