rayliuca / T-Ragx

Enhancing Translation with RAG-Powered Large Language Models
https://github.com/rayliuca/T-Ragx
MIT License

Unable to access snapshot repository #7

Closed: APSEPHION closed this issue 7 months ago

APSEPHION commented 7 months ago

Hi, I really liked playing around with T-Ragx and am now trying to self-host. I'm setting up a local instance of the translation memory according to the guide, but I can't seem to access o3t0.or.idrivee2-37.com. Full error message:

Error: {"error":{"root_cause":[{"type":"repository_exception","reason":"[public_t_ragx_translation_memory] Could not determine repository generation from root blobs"}],"type":"repository_exception","reason":"[public_t_ragx_translation_memory] Could not determine repository generation from root blobs","caused_by":{"type":"i_o_exception","reason":"Exception when listing blobs by prefix [index-]","caused_by":{"type":"sdk_client_exception","reason":"Unable to execute HTTP request: t-ragx-public.o3t0.or.idrivee2-37.com","caused_by":{"type":"unknown_host_exception","reason":"t-ragx-public.o3t0.or.idrivee2-37.com"}}}},"status":500} 

Is the service still available? If not, could you provide documentation on what indexes are required (and sample data, maybe in CSV format). Or could the error be on my side?

rayliuca commented 7 months ago

Hi @APSEPHION,

I tried to access the snapshot, and it seems to be available

Did you use the S3 keys in the guide?

Also, could you try setting the S3 endpoint in elasticsearch.yml? i.e.:

https://github.com/rayliuca/T-Ragx-Fossil/blob/044e3d7c7cd824ebabadf2293dc256b74b8c4ed9/elastic_config/elasticsearch.yml#L99
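
(For context: the linked line sets s3.client.default.endpoint. The host in the error above, t-ragx-public.o3t0.or.idrivee2-37.com, looks like the bucket name prepended to that endpoint, which is why a missing or wrong endpoint surfaces as an unknown_host_exception.)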

As for examples, I have sample data in Parquet files and a sample script to upload them to your own Elasticsearch service: https://github.com/rayliuca/T-Ragx/blob/131068827f3c2664e36957bd0f6e65d9bd981ffb/src/t_ragx/scripts/build_demo_elastic_memory_index.py#L10-L23
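
For a rough idea of the shape of such an upload (a minimal sketch, not the linked script; the file name, index name, and source language are made up, and it assumes pandas plus the official elasticsearch Python client):

import hashlib

import pandas as pd
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")
df = pd.read_parquet("sample_memory.parquet")  # hypothetical sample file

def gen_actions():
    # One document per row; the field layout is described below.
    for row in df.to_dict(orient="records"):
        source_text = row["ja"]  # assuming 'ja' is the source language here
        yield {
            "_index": "translation_memory_demo",
            "_id": hashlib.sha1(source_text.encode("utf-8")).hexdigest(),
            "_source": {**row, "id_key": "ja"},
        }

helpers.bulk(es, gen_actions())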

By default, the name of the index is translation_memory, as you can see here: https://github.com/rayliuca/T-Ragx/blob/131068827f3c2664e36957bd0f6e65d9bd981ffb/src/t_ragx/processors/ElasticInputProcessor.py#L124-L126

But you could use any name for the index (e.g. translation_memory_demo in the demo script).

My implementation of the index has the following fields:

_id: a hash of the original text, used for deduplication
lang_code_1: text in that language
lang_code_2: text in that language
lang_code_3: text in that language
corpus: the corpus the record comes from (metadata)
id_key: the lang code of the source text that the hash was calculated from

For example:

{
  "_id": "01ceca8f3c917331867c1b922cc905c63c1a9abd",
  "en": "Then I chatted with the villagers.",
  "ja": "それから村びとと話し合いました。",
  "zh": "於是我開始和村人聊天。",
  "corpus": "NLLB",
  "id_key": "ja"
}
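
The example _id is 40 hex characters, the length of a SHA-1 digest, so a plausible, but unconfirmed, recipe is hashing the text under id_key; any normalization applied first would change the output, so this sketch needn't reproduce the id above:

import hashlib

# id_key is "ja", so the hash would be computed over the Japanese source text.
source_text = "それから村びとと話し合いました。"
print(hashlib.sha1(source_text.encode("utf-8")).hexdigest())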

There can be an arbitrary number of languages associated with a record. T-Ragx searches based on the specified source lang code and returns the best-matching records that contain the target lang code.
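
As an illustration of that contract (a minimal sketch with requests, not T-Ragx's actual query; it assumes ja as the source, en as the target, and the default index name from above):

import requests

# Illustrative only: match on the source lang field, require the target lang
# field to exist, and take the top hits.
query = {
    "size": 5,
    "query": {
        "bool": {
            "must": [{"match": {"ja": "村びとと話し合いました"}}],
            "filter": [{"exists": {"field": "en"}}],
        }
    },
}
resp = requests.post("http://localhost:9200/translation_memory/_search", json=query)
for hit in resp.json()["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["en"])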

Please let me know if you can access the snapshot

APSEPHION commented 7 months ago

Thanks for the quick response. I configured the keys and endpoint as in your guide. After some more research, I found that the DNS nameserver in the container seems to be misconfigured. It fails to resolve every address I try. I'll look into that, but I don't think it's related to T-Ragx. Kind of weird, because other containers work...
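
For anyone hitting the same thing, a quick resolution check; it assumes Python is available inside the container, and the hostname comes from the error above:

import socket

# Quick DNS sanity check from inside the Elasticsearch container.
host = "t-ragx-public.o3t0.or.idrivee2-37.com"
try:
    print(host, "->", socket.gethostbyname(host))
except socket.gaierror as exc:
    print("resolution failed:", exc)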

APSEPHION commented 7 months ago

Thanks for the examples! If nothing else works, I'll DIY it.

APSEPHION commented 7 months ago

Ok, I fixed my networking. My new error is:

{"error":{"root_cause":[{"type":"repository_exception","reason":"[public_t_ragx_translation_memory] Could not determine repository generation from root blobs"}],"type":"repository_exception","reason":"[public_t_ragx_translation_memory] Could not determine repository generation from root blobs","caused_by":{"type":"i_o_exception","reason":"Exception when listing blobs by prefix [index-]","caused_by":{"type":"amazon_s3_exception","reason":"The Access Key Id you provided does not exist in our records. (Service: Amazon S3; Status Code: 403; Error Code: InvalidAccessKeyId; Request ID: 17C805C44797C21A; S3 Extended Request ID: 894a7c6c-f598-4b31-807c-acb85d440dfd; Proxy: host)"}}},"status":500}

I configured the access keys as provided here:

bin/elasticsearch-keystore add s3.client.default.access_key
CG4KwcrNPefWdJcsBIUp

bin/elasticsearch-keystore add s3.client.default.secret_key
Cau5uITwZ7Ke9YHKvWE9cXuTy5chdapBLhqVaI3C

rayliuca commented 7 months ago

hmm... it seems that it's calling Amazon S3 instead of using the custom endpoint. Did you change the config as I suggested?

To repeat the suggestion from earlier: could you try setting the S3 endpoint in elasticsearch.yml? i.e.: https://github.com/rayliuca/T-Ragx-Fossil/blob/044e3d7c7cd824ebabadf2293dc256b74b8c4ed9/elastic_config/elasticsearch.yml#L99

If you are using Docker, I would suggest taking a look at https://github.com/rayliuca/T-Ragx-Fossil, which is the docker-compose setup I'm using for the public services right now (though the folder permissions are a bit messed up at the moment; see the debug section in that repo's README)
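
For reference, registering the repository boils down to one PUT against the _snapshot API. A minimal sketch with requests; the repository name, bucket, and base_path are inferred from the error messages in this thread, so verify them against the guide:

import requests

# Hypothetical reconstruction of the repository registration call; the values
# below come from the errors quoted in this thread, not from the guide itself.
repo_settings = {
    "type": "s3",
    "settings": {
        "bucket": "t-ragx-public",
        "base_path": "elastic",
        "readonly": True,  # sensible for a public, read-only snapshot repo
    },
}
resp = requests.put(
    "http://localhost:9200/_snapshot/public_t_ragx_translation_memory",
    json=repo_settings,
)
print(resp.status_code, resp.json())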

APSEPHION commented 7 months ago

Thanks again! Fossil was not working for me, but after some struggle I found that deleting elk_data/elasticsearch/plugins/.installer.<base64_here> helped. I originally added your repository, but it seems my Dockerfile was flawed and rebuilding removed it from the config... After getting Fossil to run, the curl command to add the snapshot repository returned:

{"error":{"root_cause":[{"type":"repository_verification_exception","reason":"[public_t_ragx_translation_memory] path [elastic] is not accessible on master node"}],"type":"repository_verification_exception","reason":"[public_t_ragx_translation_memory] path [elastic] is not accessible on master node","caused_by":{"type":"i_o_exception","reason":"Unable to upload object [elastic/tests-Vyw5DPImSxybEcd1Z2wvag/master.dat] using a single upload","caused_by":{"type":"amazon_s3_exception","reason":"Access Denied. (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 17CA27A1288959A0; S3 Extended Request ID: 894a7c6c-f598-4b31-807c-acb85d440dfd)"}}},"status":500}

BUT after checking in elasticvue I was able to find the snapshot repository and begin restoring it. I'm at 9 GB; just how large is it? I also had to disable xpack.security for elasticvue, since I could not find the credentials...
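
For anyone following along, a sketch of driving the restore over plain HTTP with requests; it assumes no auth (add auth=(user, password) if xpack.security is enabled):

import requests

base = "http://localhost:9200"
repo = "public_t_ragx_translation_memory"

# List the snapshots available in the repository.
snapshots = requests.get(f"{base}/_snapshot/{repo}/_all").json()["snapshots"]
print([s["snapshot"] for s in snapshots])

# Start restoring the first snapshot from the listing above ...
snapshot_name = snapshots[0]["snapshot"]
requests.post(f"{base}/_snapshot/{repo}/{snapshot_name}/_restore")

# ... then poll active shard recoveries to watch progress.
print(requests.get(f"{base}/_cat/recovery", params={"v": "true", "active_only": "true"}).text)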

APSEPHION commented 7 months ago

Ok, restoring worked!
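
A quick way to confirm what was restored, assuming the default index name from earlier in the thread:

import requests

base = "http://localhost:9200"
# List the restored indices and check the document count.
print(requests.get(f"{base}/_cat/indices/translation_memory*", params={"v": "true"}).text)
print(requests.get(f"{base}/translation_memory/_count").json())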

rayliuca commented 7 months ago

Glad it worked!

For future visitors: