Open osherenko opened 1 year ago
It works on my side, but you need to provide some parameters for each command, such as the collection id (see --help):
do_listHtrRnn.py --colid XXX
For running an OCR model: do_htrRnn.py
For downloading transcripts, you can use Transkribus_downloader.py
If I run "do_listHtrRnn.py --colid=1219483", I get the same URL exception
Traceback (most recent call last):
File "D:\Downloads\TranskribusPyClient-master\src\TranskribusCommands\do_listHtrRnn.py", line 115, in <module>
doer.run(options.colid,options.dict)
File "D:\Downloads\TranskribusPyClient-master\src\TranskribusCommands\do_listHtrRnn.py", line 75, in run
sColModels = self.listRnns(colid)
File "d:\python310\lib\site-packages\TranskribusPyClient\client.py", line 878, in listRnns
resp.raise_for_status()
File "d:\python310\lib\site-packages\requests\models.py", line 960, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: 403 for url: https://transkribus.eu/TrpServer/rest/recognition/1219483/list?prov=CITlab
However, the command
python Transkribus_downloader.py
works just fine and downloads four files: JPG, pxml, max and json.
You need to log in and provide credentials (see https://github.com/Transkribus/TranskribusPyClient/wiki).
If you created a persistent login beforehand with
do_login.py --persist --login
then you need to add --persistent:
do_listHtrRnn.py --persistent --colid=1219483
The output of do_login.py --persist --login --pwd
- Logging onto Transkribus as --pwd and making a persistent session
403 Client Error: 403 for url: https://transkribus.eu/TrpServer/rest/auth/login
The output of do_listHtrRnn.py --persistent --colid=1219483
Usage: do_listHtrRnn.py
do_listHtrRnn.py: error: no such option: --persistent
My bad, it is --persist. You can check this by typing:
python src/TranskribusCommands/do_listHtrRnn.py --help
Usage: src/TranskribusCommands/do_listHtrRnn.py
List HTR RNN models and dictionaries available in Transkribus. Pass your login/password as options otherwise consider having a Transkribus_credential.py file, which defines a 'login' and a 'pwd' variables. If you need to use a proxy, use the --https_proxy option or set the environment variables HTTPS_PROXY. To use HTTP Basic Auth with your proxy, use the http://user:password@host/ syntax.
Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  --colid=COLID         get models linked to the colid
  --dict                get dictionaries
  -s SERVER, --server=SERVER
                        Transkribus server URL
  -l LOGIN, --login=LOGIN
                        Transkribus login (consider storing your credentials in 'transkribus_credentials.py')
  -p PWD, --pwd=PWD     Transkribus password
  --persist             Try using an existing persistent session, or log-in and persists the session.
  --https_proxy=HTTPS_PROXY
                        proxy, e.g. http://cornillon:8000
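For reference, the help text above says the credentials file only needs to define a 'login' and a 'pwd' variable. A minimal sketch of such a file is below. Note that the help text spells the file name two ways (Transkribus_credential.py and transkribus_credentials.py), so the name the scripts actually import is an assumption to check against your checkout, and the values are placeholders.

```python
# Credentials module sketch for the TranskribusCommands scripts.
# Assumption: the file sits where the scripts can import it; the exact
# file name must be checked, since the help text is inconsistent about it.
login = "YOURLOGIN"   # placeholder: your Transkribus account name
pwd = "YOURPASSWORD"  # placeholder: your Transkribus password
```

With such a file in place, the --login and --pwd options should not be needed on the command line.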
It still doesn't work. If I call the command below, I get an exception.
do_listHtrRnn.py -persist --colid=1219483
Traceback (most recent call last):
File "D:\Downloads\TranskribusPyClient-master\src\TranskribusCommands\do_listHtrRnn.py", line 115, in <module>
doer.run(options.colid,options.dict)
File "D:\Downloads\TranskribusPyClient-master\src\TranskribusCommands\do_listHtrRnn.py", line 75, in run
sColModels = self.listRnns(colid)
File "d:\python310\lib\site-packages\TranskribusPyClient\client.py", line 878, in listRnns
resp.raise_for_status()
File "d:\python310\lib\site-packages\requests\models.py", line 960, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: 403 for url: https://transkribus.eu/TrpServer/rest/recognition/1219483/list?prov=CITlab
I am using transkribus_credentials.py.
do_login.py --persist --login YOURLOGIN --pwd YOURPASSWORD
do_listHtrRnn.py --persist --colid=1219483
or
do_listHtrRnn.py --colid=1219483 --login YOURLOGIN --pwd YOURPASSWORD
I get
D:\Downloads\TranskribusPyClient-master\src\TranskribusCommands>d:\python39\python.exe do_login.py -persist -login xxx -pwd xxx
- Checking Transkribus login as ogin
403 Client Error: 403 for url: https://transkribus.eu/TrpServer/rest/auth/login
D:\Downloads\TranskribusPyClient-master\src\TranskribusCommands>d:\python39\python.exe do_listHtrRnn.py -persist -colid=1219483
Usage: do_listHtrRnn.py
do_listHtrRnn.py: error: no such option: -c
D:\Downloads\TranskribusPyClient-master\src\TranskribusCommands>d:\python39\python.exe do_listHtrRnn.py --colid=1219483 --login xxx --pwd xxx
Traceback (most recent call last):
File "D:\Downloads\TranskribusPyClient-master\src\TranskribusCommands\do_listHtrRnn.py", line 115, in <module>
doer.run(options.colid,options.dict)
File "D:\Downloads\TranskribusPyClient-master\src\TranskribusCommands\do_listHtrRnn.py", line 75, in run
sColModels = self.listRnns(colid)
File "D:\Downloads\TranskribusPyClient-master\src\TranskribusPyClient\client.py", line 878, in listRnns
resp.raise_for_status()
File "d:\python39\lib\site-packages\requests\models.py", line 960, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: 403 for url: https://transkribus.eu/TrpServer/rest/recognition/1219483/list?prov=CITlab
Use --colid, with two dashes, not a single '-'.
D:\Downloads\TranskribusPyClient-master\src\TranskribusCommands>d:\python39\python.exe do_login.py -persist -login xxx -pwd xxx
OK, a convention in many Python scripts: command-line parameters with several letters need '--', and one-letter parameters need a single '-':
python.exe do_login.py --persist --login xxx --pwd xxx
or
python.exe do_login.py --persist -l xxx -p xxx
(Note: --persist takes two '-'.)
As long as you don't get the output below from do_login, you're not logged in properly:
python.exe do_login.py --persist -l xxx -p xxx
Logging onto Transkribus as xxx and making a persistent session
--> .trnskrbs/session.txt
Done
Again, use --help to check the command-line syntax; it shows when '-' must be used and when '--' must be used:
python.exe do_login.py --help
Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -s SERVER, --server=SERVER
                        Transkribus server URL
  -l LOGIN, --login=LOGIN
                        Transkribus login (consider storing your credentials in 'transkribus_credentials.py')
  -p PWD, --pwd=PWD     Transkribus password
  --persist             Try using an existing persistent session, or log-in and persists the session.
  --https_proxy=HTTPS_PROXY
                        proxy, e.g. http://cornillon:8000
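To see why the single-dash calls earlier in the thread produced "login as ogin" and "no such option: -c", here is a small sketch. It assumes the scripts parse their arguments with Python's optparse (the "Usage:" and "no such option" messages look like optparse output) and rebuilds only the option names from the help text; build_parser and parse are stand-ins for illustration, not the real script code.

```python
from optparse import OptionParser


def build_parser():
    """Parser with the same option names as the help text (a stand-in, not the real script)."""
    parser = OptionParser(usage="%prog [options]")
    parser.add_option("--colid", dest="colid", help="get models linked to the colid")
    parser.add_option("-l", "--login", dest="login", help="Transkribus login")
    parser.add_option("-p", "--pwd", dest="pwd", help="Transkribus password")
    parser.add_option("--persist", dest="persist", action="store_true", default=False,
                      help="use or create a persistent session")
    return parser


def parse(argv):
    """Parse argv and return (options, leftover positional arguments)."""
    return build_parser().parse_args(argv)


# Correct spelling: long options take two dashes.
good, good_rest = parse(["--persist", "--login", "xxx", "--pwd", "yyy"])
print(good.persist, good.login, good.pwd, good_rest)

# Single dashes: "-login" is read as "-l" with the value "ogin",
# "-persist" and "-pwd" are read as "-p" with the values "ersist" and "wd",
# and the intended values end up as stray positional arguments.
bad, bad_rest = parse(["-persist", "-login", "xxx", "-pwd", "yyy"])
print(bad.persist, bad.login, bad.pwd, bad_rest)

# "-colid=..." starts with the unknown short option "-c", so the parser
# prints a usage error and exits.
exit_code = None
try:
    parse(["-colid=1219483"])
except SystemExit as exc:
    exit_code = exc.code
print("parser exit code:", exit_code)
```

So with single dashes the session is never made persistent and the login sent to the server is not the one you typed, which matches the 403 from do_login.py.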
Sorry for the late entry, but I only now realized that prov=CITlab may be causing the problem, as it is no longer supported, and this endpoint is no longer used. Please use this instead:
https://transkribus.eu/TrpServer/rest/models/text?prov=PyLaia
You can get the details for a model with:
https://transkribus.eu/TrpServer/rest/models/text/1234
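If you want to try the newer endpoints outside the client, a rough sketch follows. The two URL shapes come from the message above; everything else is an assumption for illustration: the helper names are hypothetical, fetch_json expects an opener that already carries a logged-in session, and it assumes the server answers with JSON when asked via an Accept header.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://transkribus.eu/TrpServer/rest"  # taken from the URLs in this thread


def model_list_url(provider="PyLaia"):
    """URL listing the text-recognition models of one provider."""
    return "%s/models/text?%s" % (BASE_URL, urlencode({"prov": provider}))


def model_detail_url(model_id):
    """URL with the details of a single model."""
    return "%s/models/text/%d" % (BASE_URL, int(model_id))


def fetch_json(url, opener):
    """GET a URL with an already-authenticated opener (hypothetical helper).

    Assumption: the server honours an 'Accept: application/json' header.
    """
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with opener.open(request) as response:
        return json.loads(response.read().decode("utf-8"))


print(model_list_url())
print(model_detail_url(1234))
```

Only the URL builders run here; fetch_json is left uncalled because it needs a valid session.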
Thanks! It really makes sense.
I am actually testing TranskribusPyClient and it calls particular REST functions. How can I switch to PyLaia in TranskribusPyClient? Or should I use PyLaia instead?
The prov=CITlab problem was fixed in the latest commit (12 days ago), when HTR+ was disabled. Just pull the latest version.
I pulled the latest version. If I run
d:\Python310\python.exe TranskribusPyClient\src\TranskribusCommands\do_listHtrRnn.py --colid=169748
I get the correct output. If I run
d:\Python310\python.exe TranskribusPyClient\src\TranskribusCommands\do_listHtrRnn.py --colid=715112
I get an exception
Traceback (most recent call last):
File "E:\Git\TranskribusPyClient\src\TranskribusCommands\do_listHtrRnn.py", line 117, in <module>
doer.run(options.colid,options.dict)
File "E:\Git\TranskribusPyClient\src\TranskribusCommands\do_listHtrRnn.py", line 75, in run
sColModels = self.listRnns(colid)
File "E:\Git\TranskribusPyClient\src\TranskribusPyClient\client.py", line 879, in listRnns
resp.raise_for_status()
File "d:\Python310\lib\site-packages\requests\models.py", line 960, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: 403 for url: https://transkribus.eu/TrpServer/rest/recognition/715112/list
As you see the only difference in both calls is the collection number.
403 is a permission error: you must provide credentials (-l, -p) or add --persist: do_listHtrRnn.py --colid=169748 --persist (I assume you have access to collection 715112).
It is strange that do_listHtrRnn.py --colid=169748 works without credentials.
Weird! This script lists the documents in the first collection, crashes when it tries to list documents in another collection, and lists documents in the first collection once more. I don't know your code, but it doesn't seem to be a problem in the client code. Could you run the script and tell me what output you get?
from TranskribusPyClient import client

tc = client.TranskribusClient()
tc.auth_login("XXX", "XXX")

# first output
print("1: my docs in collection: %s" % [t['title'] for t in tc.listDocsByCollectionId(colId=169748)])

try:
    print("docs in collection: %s" % [t['title'] for t in tc.listDocsByCollectionId(colId=715112)])
except Exception as e:
    # Client Error: 403 for url: https://transkribus.eu/TrpServer/rest/collections/715112/list
    print("Crash!!!", e)

# second output
print("2: my docs in collection: %s" % [t['title'] for t in tc.listDocsByCollectionId(colId=169748)])
I don't have access rights to these collections, so I cannot try (I would get a 403 error). Are you sure you have access rights to 715112?
After logging in Transkribus Expert Client with my credentials, I can see the 715112 collection and its transcription. So I assume that I have access rights to 715112. However, I am not a creator/owner of this collection and the creator/owner just allowed me to access the collection. How should I proceed?
Have you tried with a collection you created? But if you can access the collection with Transkribus, it should work with the Python API.
Have you tried with one collection you created?
Yes, I am the creator of collection 169748 and everything works just fine.
But if you can have access to the collection with Transkribus it should work with the python api
It should, but it doesn't work. Everything seems to work in the Transkribus expert client. Unfortunately, I can't compare REST calls in the TranskribusPyClient and the Transkribus Expert Client. Is this information stored in the log file of the java client?
I wonder whether the Python client can extract the stored transcriptions for Transkribus images.
Is it possible to start OCR using a particular recognition model?
When running particular commands, I get an error. For example, after calling do_listHtrRnn.py, I get