daniel-jach / treetag-fertilizer

MIT License

function does not find treetagger #1

Open dphh opened 1 year ago

dphh commented 1 year ago

Hi, thank you very much for creating the treetag.fertilizer function.

Unfortunately, I get an error message when using the function to tag a German corpus. Therefore, I would like to ask for help. Running

treetag.fertilizer(pathToTreeTagger = "C:/TreeTagger/", pathToCorpus = here::here("output", "interim_save.csv"), language = "german")

leads to the output:

Error in system(systemCmd, intern = TRUE) : 'C:/TreeTagger/cmd/tree-tagger-german' not found

Indeed, there is no German tagging script in "C:/TreeTagger/cmd", so I understand the error message. It puzzles me, though, because I closely followed the TreeTagger installation manual (downloading it from https://www.cis.lmu.de/~schmid/tools/TreeTagger/) and have already tagged corpora successfully using koRpus. So I wonder: is there a file missing in my TreeTagger/cmd/ directory, and if so, where could I download it?

In the directory TreeTagger/cmd on my machine, there is the file "filter-chunker-output-german.perl", and in the directory TreeTagger/lib, there is the file "german.par". Should pathToTreeTagger lead to either of these files?

I would be grateful for any advice about making treetag.fertilizer work.

Additional note: After some more investigation, this problem seems to be related to my using a Windows machine. The Mac and Linux installation manual includes a step about downloading tagging scripts. This step is not part of the Windows installation manual. So some modifications may be necessary to use the treetag.fertilizer function on Windows.
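One way to see which language-specific files an installation actually contains is to list them from R. The following is only a sketch: `find_language_files` is a hypothetical helper, not part of treetag.fertilizer, and it makes no assumption about which file names a Windows installation ought to contain.

```r
# Hypothetical diagnostic helper (not part of treetag.fertilizer).
# Lists every file below the TreeTagger directory whose name contains
# the language name, e.g. tagging scripts or parameter files.
find_language_files <- function(pathToTreeTagger, language) {
  list.files(
    path = pathToTreeTagger,
    pattern = language,   # interpreted as a regular expression
    recursive = TRUE,     # also search cmd/, lib/, bin/, ...
    ignore.case = TRUE
  )
}

# Example call (paths are returned relative to pathToTreeTagger):
# find_language_files("C:/TreeTagger/", "german")
```

For the installation described above, the result should include at least cmd/filter-chunker-output-german.perl and lib/german.par. If no file named tree-tagger-german appears, the path built by the function cannot exist.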

daniel-jach commented 1 year ago

Hi, Thank you for your interest in my function.

From the error you received, I would guess that the problem is not with the function but with a file missing from the TreeTagger installation. Have you completed step 2 of the TreeTagger installation guide: "2. Download the tagging scripts into the same directory."? That's where you can download the missing files. Please find the missing files attached.

Let me know whether that helped. Best, Daniel

dphh commented 1 year ago

Hi, many thanks for getting back to me. I think that my difficulties using your function on a Windows machine have to do with the TreeTagger installation working differently on Windows. The TreeTagger installation guide that you quoted is about installing TreeTagger on Linux and Mac machines. The installation guide for Windows machines is not on the website, but I attach it here (INSTALL.txt). It is part of the Windows installation, which can be downloaded here: https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/tree-tagger-windows-3.2.3.zip

As an experiment, I downloaded the German parameter file (https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/german.par.gz) and placed it in the cmd folder of my TreeTagger installation. Even then, I still get the same error message as quoted above.

So, I think that I might have to adapt your function to use it on my Windows machine. Do you also think so? Unfortunately, doing this is beyond my current capabilities.

Best,

David

daniel-jach commented 1 year ago

I have tested the function only on Ubuntu.

Try the following. Windows uses backslashes in file paths. Change your pathToTreeTagger to C:\TreeTagger\ (with backslashes instead of forward slashes) and change line 4 of the function from

treetagger<-paste(pathToTreeTagger, "cmd/", "tree-tagger-", language, sep = "")

to

treetagger<-paste(pathToTreeTagger, "cmd\", "tree-tagger-", language, sep = "")

dphh commented 1 year ago

Hi Daniel, many thanks for taking the time to look into this. I think that Windows file paths with forward slashes should work in R. However, I followed your suggestion and replaced slashes with backslashes in order to provide correct Windows file paths (for future reference, R then requires double-backslashes here, e.g. C:\\TreeTagger\\, because it interprets single backslashes as escape characters).
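The escaping rule mentioned here can be illustrated with a short sketch. The variable names are illustrative only; the point is that a doubled backslash in R source code stands for a single backslash character in the resulting string.

```r
# Two spellings of the same Windows directory in R source code.
forward <- "C:/TreeTagger/"
escaped <- "C:\\TreeTagger\\"   # each "\\" is ONE backslash in the string

nchar(escaped)                  # 14 characters, not 16

# Replacing the backslashes with forward slashes gives the first spelling.
identical(gsub("\\", "/", escaped, fixed = TRUE), forward)   # TRUE

# writeLines() shows the string as the operating system would see it.
writeLines(escaped)
```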

Unfortunately, this merely changes the error message slightly. It now reads: Error in system(systemCmd, intern = TRUE) : 'C:\TreeTagger\cmd\tree-tagger-german' not found.

So, my problem persists. My TreeTagger installation should be fine since koRpus::treetag works (it is just too slow for some of my tasks). I still suspect this might have to do with the different installation process for TreeTagger under Windows. If you have another idea for a solution, that would be great. As my time to work on my current project is limited, I will have to try to get along without the treetag.fertilizer function for now...
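Because the error is raised by system() when the file cannot be found, checking for the file before the call can make the cause easier to spot. A minimal sketch, assuming the script path has already been built as a string; `check_tagger_script` is a hypothetical helper and not part of treetag.fertilizer.

```r
# Hypothetical pre-flight check (not part of treetag.fertilizer).
# Stops with a descriptive message if the tagging script does not exist,
# otherwise returns the path unchanged (invisibly).
check_tagger_script <- function(treetagger) {
  if (!file.exists(treetagger)) {
    stop("No tagging script found at: ", treetagger, call. = FALSE)
  }
  invisible(treetagger)
}

# Example call:
# check_tagger_script("C:/TreeTagger/cmd/tree-tagger-german")
```

Such a check only reports whether the file exists. It does not show whether system() can execute it on a given platform.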

All best, David

Imusten commented 3 months ago

Hi, try correcting the first line in the function as follows:

treetagger<-paste(pathToTreeTagger, "/cmd/", "tree-tagger-", language, sep = "")

The only thing missing was a slash before cmd, which leads to an error in finding the path to the language executable.
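Whether the extra slash is needed depends on whether pathToTreeTagger already ends in a separator. The sketch below builds the path so that either form works; `build_tagger_path` is a hypothetical helper, not the function's actual code.

```r
# Hypothetical helper (not part of treetag.fertilizer).
# Builds the path to the language-specific tagging script so that it works
# whether or not pathToTreeTagger ends in "/" or "\".
build_tagger_path <- function(pathToTreeTagger, language) {
  # Drop trailing separators so exactly one "/" is inserted below.
  root <- sub("[/\\\\]+$", "", pathToTreeTagger)
  paste0(root, "/cmd/tree-tagger-", language)
}

# Both calls produce the same path:
build_tagger_path("C:/TreeTagger", "german")
build_tagger_path("C:/TreeTagger/", "german")
```

This only makes the path well formed. It does not help if no file named tree-tagger-german exists in the cmd directory.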