withanage / HEIDIEditor

HTML5 WYSIWYG Editor for NML-XML (Standalone Application or as OMP Plugin)
GNU General Public License v3.0

command should run as a normal user #40

Closed withanage closed 9 years ago

withanage commented 10 years ago

unoconv -f docx -o /path/to/out_folder/doc/new.docx /path/to/input_file.doc

may- commented 10 years ago

I updated some libraries for unoconv on the remote server. Now I can run the command from the terminal (without sudo), but not from a script.

$ unoconv -f docx -o /path/to/out_folder/doc/new.docx /path/to/input_file.doc

generates new.docx


If I call the following script from the browser, I see the return code 251.

#!/usr/bin/python
import subprocess
cmd = 'unoconv -f docx -o /path/to/out_folder/doc/new.docx /path/to/input_file.doc'
retcode = ''
try:
    subprocess.check_call(cmd, stdin=None, shell=True)
except subprocess.CalledProcessError as e:
    # Keep the exception so its exit status can be shown in the browser.
    retcode = e
print 'Content-type: text/html\n'
print retcode

As far as I know, return code 251 means a write error. I manually granted full permissions on /path/to/out_folder/doc, but nothing changed. What should I do to solve this problem?

withanage commented 10 years ago

Check whether the user who owns the Apache process can write to the folder.

withanage commented 10 years ago

Try putting the folder in the same group as Apache and giving that group write permission.
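A minimal Python sketch of that check, assuming it is run as the web server user (for example from a CGI script). The helper name `describe_write_access` is hypothetical. Note that every parent directory of the folder also needs search (execute) permission for that user, which this sketch does not verify.

```python
import grp
import os
import pwd
import stat


def _name(lookup, ident, attr):
    """Resolve a numeric uid/gid to a name, falling back to the number itself."""
    try:
        return getattr(lookup(ident), attr)
    except KeyError:
        return str(ident)


def describe_write_access(path):
    """Report owner, group, mode, and whether the current user can write to path."""
    st = os.stat(path)
    return {
        "path": path,
        "owner": _name(pwd.getpwuid, st.st_uid, "pw_name"),
        "group": _name(grp.getgrgid, st.st_gid, "gr_name"),
        "mode": stat.filemode(st.st_mode),
        # Creating a file inside a directory needs both write and search permission.
        "writable": os.access(path, os.W_OK | os.X_OK),
    }
```

Printing `describe_write_access('/path/to/out_folder/doc')` from the CGI script shows what the web server user sees, rather than what the login user sees.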

MartinPaulEve commented 10 years ago

Just to clarify: this usage is not required. You can run meTypeset with the "doc" option and it will handle the unoconv internally.

Dr. Martin Paul Eve Lecturer in English Literature University of Lincoln

E: meve@lincoln.ac.uk W: https://www.martineve.com

Founder, Open Library of the Humanities (https://www.openlibhums.org) Chief Editor, Orbit: Writing Around Pynchon (https://www.pynchon.net) Web Editor, Alluvium, (http://www.alluvium-journal.org)

may- commented 10 years ago

@MartinPaulEve , Thank you for the comment.

When I trigger meTypeset with the input format "doc" from the script (that is, not from the command line), meTypeset returns an error and aborts. I figured out that its internal handling of unoconv causes this error. I rewrote line 46 of meTypeset/bin/unoconvtodocx.py:

#subprocess.call(unoconv_command, stdin=None, shell=True)
try:
    subprocess.check_call(unoconv_command, stdin=None, shell=True)
except subprocess.CalledProcessError as e:
    # Report the failing command and its exit status in the debug log.
    self.debug.print_debug(self, u'unoconv error: {0}'.format(e))

It then returned: Command 'unoconv -f docx -o /path/to/out_folder/doc/new.docx /path/to/input_file.doc' returned non-zero exit status 251


@withanage Dulip, the output folder has the owner "www-data" and the group "www-data" (the Apache process), because the folder is created by meTypeset. Even though I manually set the permissions on the output folder to 777, the same error still occurs.

MartinPaulEve commented 10 years ago

Hmmm, curious. I wonder what that is? Seems like an environment problem... Does the command work if you run manually from bash?

may- commented 10 years ago

Hi @MartinPaulEve ,

Hmmm, curious. I wonder what that is? Seems like an environment problem...

yeah, definitely....

Does the command work if you run manually from bash?

Do you mean the unoconv command? If so, yes.

$ which unoconv
/usr/bin/unoconv
$ unoconv --version
unoconv 0.6
Written by Dag Wieers 
Homepage at http://dag.wieers.com/home-made/unoconv/
platform posix/linux2
python 2.7.3 (default, Feb 27 2014, 19:58:35) 
[GCC 4.6.3]
LibreOffice 3.5
$ unoconv -f docx -o /path/to/out_folder/doc/new.docx /path/to/input_file.doc
## this command creates `new.docx` properly.

permission:

$ ls -al /path/to
drwxrwxrwx 7 www-data www-data  4096 Okt  10 13:37 ./
drwxrwxrwx 3 www-data www-data  4096 Okt  10 15:04 ../
-rw-rw-rw- 1 www-data www-data 22485 Okt  10 15:02 input_file.doc
drwxr-xr-x 6 www-data www-data  4096 Okt  10 15:07 out_folder/

MartinPaulEve commented 10 years ago

Hmmm.

Is this a "command not found" error? Is /usr/bin definitely on the shell env path for www-data user?
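One way to answer the PATH question without guessing is to resolve the command against the exact PATH string the server process uses. A minimal sketch in Python 3, where the helper name `resolve_command` is hypothetical and `shutil.which` is the standard library lookup:

```python
import shutil


def resolve_command(name, path_value):
    """Return the full path `name` resolves to under the given PATH string, or None."""
    return shutil.which(name, path=path_value)
```

For example, `resolve_command('unoconv', '/usr/local/bin:/usr/bin:/bin')` returns the binary that would be picked up under that PATH, or `None` if the command would not be found.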

may- commented 10 years ago

Is this a "command not found" error?

No, meTypeset just says:

   ....
  File "/path/to/meTypeset/bin/docxtotei.py", line 128, in run
    with zipfile.ZipFile(self.gv.input_file_path, "r") as z:
IOError: [Errno 2] No such file or directory: '/path/to/output_folder/doc/new.docx'

unoconv error:

Error: Unable to connect or start own listener. Aborting.

Is /usr/bin definitely on the shell env path for www-data user?

Yes. /etc/init.d/apache2 has the line:

ENV="env -i LANG=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

MartinPaulEve commented 10 years ago

But that meTypeset error is because it expects, after unoconv runs, to find the new .docx file. If it can't find unoconv, then the .docx won't exist and this could cause the problem...
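The dependency described here (the next step assumes the converter produced its file) can be made explicit. A minimal sketch in Python 3, not meTypeset's actual code: the helper name `run_and_require_output` is hypothetical, and it simply refuses to continue unless the command exits with status 0 and the expected file exists.

```python
import os
import subprocess


def run_and_require_output(argv, expected_output):
    """Run argv and raise unless it exits 0 and expected_output was created."""
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        raise RuntimeError("command failed with exit status {0}: {1}".format(
            result.returncode, result.stderr.strip()))
    if not os.path.isfile(expected_output):
        raise RuntimeError(
            "command succeeded but {0} was not created".format(expected_output))
    return expected_output
```

With a check like this, the failure surfaces at the conversion step with the converter's own error text, instead of later as a missing-file error when the .docx is opened.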

On 13/10/14 16:16, may ohta wrote:

Is this a "command not found" error?

No, meTypeset just says:

....

File "/path/to/meTypeset/bin/docxtotei.py", line 128, in run with zipfile.ZipFile(self.gv.input_file_path, "r") as z: IOError: [Errno 2] No such file or directory: '/path/to/output_folder/doc/new.docx'

unoconv error:

Error: Unable to connect or start own listener. Aborting.


Is /usr/bin definitely on the shell env path for www-data user?

Yes. |/etc/init.d/apache2| has the line:

ENV="env -i LANG=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

— Reply to this email directly or view it on GitHub https://github.com/withanage/HEIDIEditor/issues/40#issuecomment-58906970.

Dr. Martin Paul Eve Lecturer in English Literature University of Lincoln

E: meve@lincoln.ac.uk W: https://www.martineve.com

Founder, Open Library of the Humanities (https://www.openlibhums.org) Chief Editor, Orbit: Writing Around Pynchon (https://www.pynchon.net) Web Editor, Alluvium, (http://www.alluvium-journal.org)

may- commented 10 years ago

I know. I just mean, it isn't a "command not found" error, because unoconv returns "Error: Unable to connect or start own listener. Aborting.", and that's why meTypeset can't find the file.

MartinPaulEve commented 10 years ago

Ah, OK, yes. True.

Can you pipe stdout and stderr to a file to debug?
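A minimal sketch of one way to do that from a Python 3 script: the helper name `run_logged` and the log location are hypothetical, and the log file must be somewhere the web server user can write.

```python
import subprocess


def run_logged(argv, log_path):
    """Run argv, append its stdout and stderr to log_path, and return the exit status."""
    with open(log_path, "a") as log:
        log.write("$ {0}\n".format(" ".join(argv)))
        # Flush our own header before the child starts writing to the same file.
        log.flush()
        status = subprocess.call(
            argv,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,  # merge stderr into the same log
        )
        log.write("exit status: {0}\n".format(status))
    return status
```

Because the command is passed as a list and written straight to a file, the output does not depend on how the CGI layer or the Apache error log handles the child's streams.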

may- commented 10 years ago

OK, I'll try. Please wait...

may- commented 10 years ago

Sorry, it took a long time to find out how to check stdout and stderr for a CGI script, and even then I saw nothing new.

stdout

Traceback (most recent call last):
  File "/path/to/test.py", line 40, in <module>
    log = meTypeset.test(opt)
  File "/path/to/meTypeset/bin/meTypeset.py", line 261, in test
    me_typeset_instance.run()
  File "/path/to/meTypeset/bin/meTypeset.py", line 246, in run
    self.run_modules()
  File "/path/to/meTypeset/bin/meTypeset.py", line 143, in run_modules
    DocxToTei(self.gv).run(True, self.args['--proprietary'])
  File "/path/to/meTypeset/bin/docxtotei.py", line 128, in run
    with zipfile.ZipFile(self.gv.input_file_path, "r") as z:
  File "/usr/lib/python2.7/zipfile.py", line 756, in __init__
    self.fp = open(file, modeDict[mode])
IOError: [Errno 2] No such file or directory: '/path/to/out_folder/doc/new.docx'

stderr (apache2/error.log)

[Mon Oct 13 18:35:48.513337 2014] [cgi:error] [pid 5309] [client 127.0.0.1:42403] AH01215: Error: Unable to connect or start own listener. Aborting.

my test cgi script test.py:

#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
import shutil
import sys
sys.path.append('/path/to/meTypeset/bin')
import meTypeset
#import cgitb
#cgitb.enable()
# Send errors to the browser along with the normal output.
sys.stderr = sys.stdout
print 'Content-type: text/html\n'
# Start from a clean output folder on every run.
if os.path.exists('/path/to/out_folder'):
    shutil.rmtree('/path/to/out_folder')
opt = 'doc /path/to/input_file.doc /path/to/out_folder -d'
log = meTypeset.test(opt)
print
print log

Note: I changed meTypeset a little bit so that it can be imported as a package (my fork: https://github.com/may-/meTypeset). If I call this script test.py from the command line, it works. The problem occurs only when I call the script from the browser.
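Since the same script behaves differently from the command line and from the browser, it can help to print what each run actually looks like. A minimal sketch in Python 3, with the hypothetical helper name `describe_runtime`; the choice of variables is an assumption about what is likely to differ.

```python
import os


def describe_runtime():
    """Collect the process identity and the environment values that often differ under CGI."""
    return {
        "uid": os.getuid(),
        "euid": os.geteuid(),
        "cwd": os.getcwd(),
        "HOME": os.environ.get("HOME"),
        "PATH": os.environ.get("PATH"),
        "DISPLAY": os.environ.get("DISPLAY"),
    }
```

Running it once from a terminal and once through the web server, then comparing the two results, narrows down which part of the environment the failing run is missing.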

MartinPaulEve commented 10 years ago

Hmmm.

What is the actual path of /path/to/out_folder/doc/ ? Is it specified absolutely or relatively?

Can you enable the debug flag so that we can see more of what meTypeset is doing?

Best wishes,

Martin

may- commented 10 years ago

Hello Martin,

I use the absolute path /home/m1o/public_html/HEIDIEditor/testdata/out_folder. Directory structure:

$ tree /home/m1o/public_html
/home/m1o/public_html
├─ HEIDIEditor
│ ├─ cgi
│ │ └─ test.py
│ └─ testdata
│   ├─ in_file.doc
│   └─ out_folder
└─ meTypeset
$ ls -al /home/m1o/public_html/HEIDIEditor/testdata
drwxrwxrwx  3 www-data www-data    4096 Okt 14 10:33 ./
drwxrwxr-x 10 m1o      m1o         4096 Okt  9 20:05 ../
-rw-rw-r--  1 www-data www-data  202752 Okt 14 10:15 in_file.doc
drwxrwxrwx  6 www-data www-data    4096 Okt 14 10:57 out_folder/

test cgi script (url: server_host/~m1o/HEIDIEditor/cgi/test.py; browser: firefox, chrome):

#!/usr/bin/python
import os
import shutil
import sys
print 'Content-type: text/html; charset=utf-8\n'
# Relative paths are resolved against the CGI script's working directory.
meTypeset_dir = os.path.abspath('../../meTypeset/bin')
sys.path.append(meTypeset_dir)
import meTypeset
out_dir = os.path.abspath('../testdata/out_folder')
in_file = os.path.abspath('../testdata/in_file.doc')
# Start from a clean output folder on every run.
if os.path.exists(out_dir):
    shutil.rmtree(out_dir)
opt = 'doc '+in_file+' '+out_dir+' -d'
print meTypeset_dir+'/meTypeset.py '+opt
print '*'*20
log = meTypeset.test(opt)
print

stdout (with debug log):

/home/m1o/public_html/meTypeset/bin/meTypeset.py doc /home/m1o/public_html/HEIDIEditor/testdata/in_file.doc /home/m1o/public_html/HEIDIEditor/testdata/out_folder -d
********************
[Main] Running at aggression level 10 [grrr!]
[Main] Metadata file wasn't specified. Falling back to /home/m1o/public_html/meTypeset/metadata/metadataSample.xml
[UNOCONV to DOCX] Running unoconv transform (DOC->DOCX)
[UNOCONV to DOCX] unoconv error: Command 'unoconv -f docx -o /home/m1o/public_html/HEIDIEditor/testdata/out_folder/doc/new.docx /home/m1o/public_html/HEIDIEditor/testdata/in_file.doc' returned non-zero exit status 251
[DOCX to TEI] Unzipping /home/m1o/public_html/HEIDIEditor/testdata/out_folder/doc/new.docx to /home/m1o/public_html/HEIDIEditor/testdata/out_folder/docx

I think the error is not in meTypeset but in unoconv, because the following CGI script also fails (as I commented 4 days ago). unoconvtest.py:

#!/usr/bin/python
import subprocess
out_dir = '/home/m1o/public_html/HEIDIEditor/testdata'
cmd = ['unoconv', '-f', 'docx', '-o', out_dir+'/out_folder/doc/new.docx', out_dir+'/in_file.doc']
cmd = ' '.join(cmd)
retcode = 'success!'
try:
    subprocess.check_call(cmd, stdin=None, shell=True)
except subprocess.CalledProcessError as e:
    retcode = e
print 'Content-type: text/html\n'
print retcode

MartinPaulEve commented 10 years ago

Hmm. I think that Alex might have some insight on this from the web server setup that he put together...

axfelix commented 10 years ago

I just checked on our server. I can't quite remember whether we did this for a reason or whether it was just what happened to be in the Ubuntu sources, but the version of unoconv on the path (the one called by meTypeset, which afaik doesn't specify a path) is only 0.4, which works fine with LibreOffice 4.3.2.2 on there. We have 0.6 installed separately, because we've set the webservice up to use a /vendor subdirectory to manage the various external libs and we wanted a separate copy in there. It works fine too, though it may be relevant that we only use 0.6 to convert doc to docx (when unoconv is called directly by the webservice and the path is specified) and 0.4 to convert wmf to png (when it's called by meTypeset).

may- commented 10 years ago

Hello, thank you for the comment.

On our server, only one unoconv is installed (and only one libreoffice). I checked the version of it.

$ sudo -u www-data which unoconv
/usr/bin/unoconv
$ sudo -u www-data unoconv --version
unoconv 0.6
Written by Dag Wieers 
Homepage at http://dag.wieers.com/home-made/unoconv/
platform posix/linux
python 2.7.3 (default, Feb 27 2014, 19:58:35)
[GCC 4.8.2]
LibreOffice 4.2

I also confirmed that meTypeset uses this version of unoconv. (I added the "unoconv --version" and "which unoconv" commands to meTypeset/bin/unoconvtodocx.py, and they returned exactly the same output as above.)

I tried version 0.4 in another environment, but it just returns a "docx undefined" error, because it doesn't support the docx format.

unoconv debug log says:

$ sudo -u www-data unoconv -f docx -o out_file.docx -vvv in_file.doc 
Verbosity set to level 3
Using office base path: /usr/lib/libreoffice
Using office binary path: /usr/lib/libreoffice/program
DEBUG: Connection type: socket,host=127.0.0.1,port=2002;urp;StarOffice.ComponentContext
DEBUG: Existing listener not found.
DEBUG: Launching our own listener using /usr/lib/libreoffice/program/soffice.bin.
LibreOffice listener successfully started. (pid=8542)
Failed to connect to /usr/lib/libreoffice/program/soffice.bin (pid=8542) in 6 seconds.
Connector : couldn't connect to socket (Success)
Error: Unable to connect or start own listener. Aborting.

I read the troubleshooting section "Problems running unoconv from Apache/PHP" of the unoconv README, and these comments: https://github.com/dagwieers/unoconv/issues/87 but I didn't find anything that helped...

What should I do now? Please help me!!
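The log above shows unoconv giving up after failing to reach a listener on 127.0.0.1 port 2002. A minimal sketch in Python 3 for probing that socket independently of unoconv; the helper name `port_is_accepting` is hypothetical, and the host and port are taken from the "Connection type" line of the log.

```python
import socket


def port_is_accepting(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_is_accepting('127.0.0.1', 2002)` while soffice.bin is supposedly running tells you whether the process ever opened its socket, which separates "LibreOffice did not come up" from "unoconv could not talk to it".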

MartinPaulEve commented 10 years ago

Does /usr/lib/libreoffice/program/soffice.bin exist?

may- commented 10 years ago

Does /usr/lib/libreoffice/program/soffice.bin exist?

Yes.

$ ls -al /usr/lib/libreoffice/program/ | grep soffice
-rwxr-xr-x  1 root root     5645 Aug 28 16:55 soffice
-rwxr-xr-x  1 root root     6240 Aug 28 19:43 soffice.bin
-rw-r--r--  1 root root      789 Aug 28 21:05 sofficerc

And here is the debug log when the command is run by a normal user (it works):

$ unoconv -f docx -o /home/m1o/Desktop/out_file.docx -vvv /home/m1o/Desktop/in_file.doc
Verbosity set to level 3
Using office base path: /usr/lib/libreoffice
Using office binary path: /usr/lib/libreoffice/program
DEBUG: Connection type: socket,host=127.0.0.1,port=2002;urp;StarOffice.ComponentContext
DEBUG: Existing listener not found.
DEBUG: Launching our own listener using /usr/lib/libreoffice/program/soffice.bin.
LibreOffice listener successfully started. (pid=8712)
Input file: /home/m1o/Desktop/out_file.doc
Selected output format: Microsoft Office Open XML [.docx]
Selected office filter: Office Open XML Text
Used doctype: document
Output file: /home/m1o/Desktop/out.docx
DEBUG: Terminating LibreOffice instance.
DEBUG: Waiting for LibreOffice instance to exit.

MartinPaulEve commented 10 years ago

OK, so it's nothing to do with PHP, apache or anything else. It's the user www-data that is the problem.

What happens if you try to run /usr/lib/libreoffice/program/soffice.bin as www-data?

@axfelix: any further thoughts?

axfelix commented 10 years ago

I agree it sounds like a www-data permissions issue -- what distro is this? Could it be some weird SELinux business that's specifically preventing the www-data user from starting the listener service on 2002? I really hate SELinux and like to blame it for arbitrary issues :)

may- commented 10 years ago

What happens if you try to run /usr/lib/libreoffice/program/soffice.bin as www-data?

The console shows "No protocol specified" x 2 ...

$ sudo -u www-data /usr/lib/libreoffice/program/soffice.bin 
No protocol specified
No protocol specified

... and then hangs.
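As far as I know, "No protocol specified" is usually printed by the X11 client library when access to the display named in DISPLAY is refused, which would mean the sudo session inherited the login user's display settings. That is an assumption about this setup, not something the thread confirms. Under that assumption, a minimal Python 3 sketch for building a child environment without display variables and with a HOME the service user can write to (the helper name `headless_env` and the directory are hypothetical):

```python
import os


def headless_env(home_dir, base_env=None):
    """Return a copy of the environment with HOME replaced and X11 display variables removed."""
    env = dict(os.environ if base_env is None else base_env)
    env["HOME"] = home_dir          # must be writable by the user running the command
    env.pop("DISPLAY", None)        # do not try to reach the login user's X display
    env.pop("XAUTHORITY", None)
    return env
```

The result can be passed as `env=` to `subprocess.call` when launching the converter. Whether this resolves the listener failure here is untested.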

MartinPaulEve commented 10 years ago

Hmm, I don't know what the syntax is to set up a listener.

Additional suggestions from the unoconv bug thread:

Change www-data's shell to /bin/bash instead of nologin.

Change www-data's home directory to something other than /root using PUTENV.

Also, you could dump the output of this command to see how the environments differ:

diff <(sudo -E -H -u www-data env) <(sudo -E -H -u root env)
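The same comparison can be done on captured `env` output, which is handy when one of the two environments comes from a CGI run rather than from sudo. A minimal sketch in Python 3; the helper names are hypothetical, and the parser assumes single-line values.

```python
def parse_env_output(text):
    """Parse the output of `env` into a dict (single-line values only)."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            env[name] = value
    return env


def diff_env(left, right):
    """Return {name: (left_value, right_value)} for every variable that differs."""
    names = sorted(set(left) | set(right))
    return {n: (left.get(n), right.get(n)) for n in names if left.get(n) != right.get(n)}
```

A variable that exists on only one side shows up with `None` for the other, so missing variables are reported as well as changed ones.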

may- commented 10 years ago

$ finger www-data
Login: www-data                 Name: www-data
Directory: /var/www                     Shell: /usr/sbin/nologin
Never logged in.
No mail.
No Plan.
$ sudo chsh www-data
Changing the login shell for www-data
Enter the new value, or press ENTER for the default
    Login Shell [/usr/sbin/nologin]: /bin/bash
$ finger www-data
Login: www-data                 Name: www-data
Directory: /var/www                     Shell: /bin/bash
Never logged in.
No mail.
No Plan.
$ diff <(sudo -E -H -u www-data env) <(sudo -E -H -u root env)
24c24
< USER=www-data
---
> USER=root
60c60
< HOME=/var/www
---
> HOME=/root
65c65
< LOGNAME=www-data
---
> LOGNAME=root
84c84
< USERNAME=www-data
---
> USERNAME=root

I still see the error:

...
Failed to connect to /usr/lib/libreoffice/program/soffice.bin (pid=10349) in 6 seconds.
...

MartinPaulEve commented 10 years ago

Hmmm.

Can you create another user to which www-data could then sudo for the unoconv call?

may- commented 10 years ago

I'm not sure whether I did exactly what you intended...

$ sudo adduser apache
Adding user `apache' ...
Adding new group `apache' (1003) ...
Adding new user `apache' (1003) with group `apache' ...
Creating home directory `/home/apache' ...
Copying files from `/etc/skel' ...
Enter new UNIX password: 
Retype new UNIX password: 
passwd: password updated successfully
Changing the user information for apache
Enter the new value, or press ENTER for the default
    Full Name []: apache
    Room Number []: 
    Work Phone []: 
    Home Phone []: 
    Other []: 
Is the information correct? [Y/n] 
$ sudo adduser apache sudo
Adding user `apache' to group `sudo' ...
Adding user apache to group sudo
Done.
$ sudo -u apache unoconv -f docx -o out_file.docx -vvv in_file.doc 
Verbosity set to level 3
Using office base path: /usr/lib/libreoffice
Using office binary path: /usr/lib/libreoffice/program
DEBUG: Connection type: socket,host=127.0.0.1,port=2002;urp;StarOffice.ComponentContext
DEBUG: Existing listener not found.
DEBUG: Launching our own listener using /usr/lib/libreoffice/program/soffice.bin.
LibreOffice listener successfully started. (pid=10849)
Failed to connect to /usr/lib/libreoffice/program/soffice.bin (pid=10849) in 6 seconds.
Connector : couldn't connect to socket (Success)
Error: Unable to connect or start own listener. Aborting.

oh, still fails!

$ sudo -u apache sudo unoconv -f docx -o out_file.docx -vvv in_file.doc 
[sudo] password for apache: 
Verbosity set to level 3
Using office base path: /usr/lib/libreoffice
Using office binary path: /usr/lib/libreoffice/program
DEBUG: Connection type: socket,host=127.0.0.1,port=2002;urp;StarOffice.ComponentContext
DEBUG: Existing listener not found.
DEBUG: Launching our own listener using /usr/lib/libreoffice/program/soffice.bin.
LibreOffice listener successfully started. (pid=10920)
Input file: /home/m1o/Desktop/in_file.doc
Selected output format: Microsoft Office Open XML [.docx]
Selected office filter: Office Open XML Text
Used doctype: document
Output file: /home/m1o/Desktop/out_file.docx
DEBUG: Terminating LibreOffice instance.
DEBUG: Waiting for LibreOffice instance to exit.

Does that mean I should change the Apache configuration file like this:

#User ${APACHE_RUN_USER}
#Group ${APACHE_RUN_GROUP}
User apache
Group apache

and call unoconv in meTypeset with a "sudo" prefix?

MartinPaulEve commented 10 years ago

Hmmm, so why does this work for the "normal user" that you posted earlier? Is it a problem with calling unoconv through sudo, perhaps? Can you do:

sudo www-data -i

then the unoconv command and see what happens?

Re. the below: no, do not run sudo like that. You'd be putting unfiltered input into a root shell. I meant using sudo to change user to apache (but that didn't work).
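
The same concern applies to any script that hands a command string to a shell. A minimal sketch, assuming the file paths may come from uploads, of passing the arguments as a list so that no shell interprets them. The names build_unoconv_cmd and convert are hypothetical and not part of meTypeset; the -f and -o flags are the ones used earlier in this thread.

```python
import subprocess


def build_unoconv_cmd(in_path, out_path, fmt="docx"):
    """Return the unoconv argument list. No shell is involved, so characters
    such as ';' or '$' in a path are passed through literally."""
    return ["unoconv", "-f", fmt, "-o", out_path, in_path]


def convert(in_path, out_path, fmt="docx"):
    """Run unoconv as the current (unprivileged) user and return its exit code."""
    return subprocess.call(build_unoconv_cmd(in_path, out_path, fmt))
```

With a list and no shell=True, a hostile file name ends up as a single argument to unoconv rather than as extra shell commands.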

may- commented 10 years ago

??? Isn't it a syntax error?

$ sudo www-data -i
[sudo] password for m1o: 
sudo: www-data: command not found

The unoconv command always succeeds when I call it as a normal user (m1o) from my home directory (/home/m1o).

$ unoconv -f docx -o out_file.docx -vvv in_file.doc 
Verbosity set to level 3
Using office base path: /usr/lib/libreoffice
Using office binary path: /usr/lib/libreoffice/program
DEBUG: Connection type: socket,host=127.0.0.1,port=2002;urp;StarOffice.ComponentContext
DEBUG: Existing listener not found.
DEBUG: Launching our own listener using /usr/lib/libreoffice/program/soffice.bin.
LibreOffice listener successfully started. (pid=11377)
Input file: /home/m1o/Desktop/in_file.doc
Selected output format: Microsoft Office Open XML [.docx]
Selected office filter: Office Open XML Text
Used doctype: document
Output file: /home/m1o/Desktop/out_file.docx
DEBUG: Terminating LibreOffice instance.
DEBUG: Waiting for LibreOffice instance to exit.

... and I have absolutely no idea about SELinux, but I think our server doesn't have selinux:

$ apt-cache policy selinux
selinux:
  Installed: (none)
  Candidate: 1:0.11
  Version table:
     1:0.11 0
        500 http://de.archive.ubuntu.com/ubuntu/ trusty/universe amd64 Packages

Or I've just misunderstood your comment??

Our server is Ubuntu 12.04.5 LTS.

$ uname -a
Linux kjc-sv003 2.6.32-61-server #124-Ubuntu SMP Wed Jun 4 23:18:27 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

MartinPaulEve commented 10 years ago

Sorry, missed -u. I'm basically suggesting sudoing to a full shell.

axfelix commented 10 years ago

SELinux shouldn't be a problem on Ubuntu, so that's ruled out.

may- commented 10 years ago

Do you mean su www-data? Unfortunately, I don't have the root passwd... I have to ask Dulip.

Is there a way to change the login user without root passwd?

may- commented 10 years ago

Sorry, I was able to switch to the www-data user without the root passwd after all. unoconv still fails.

www-data@kjc-sv003:/home/m1o$ unoconv -f docx -o /home/m1o/Desktop/out_file.docx -vvv /home/m1o/Desktop/in_file.doc 
Verbosity set to level 3
Using office base path: /usr/lib/libreoffice
Using office binary path: /usr/lib/libreoffice/program
DEBUG: Connection type: socket,host=127.0.0.1,port=2002;urp;StarOffice.ComponentContext
DEBUG: Existing listener not found.
DEBUG: Launching our own listener using /usr/lib/libreoffice/program/soffice.bin.
LibreOffice listener successfully started. (pid=12130)
DEBUG: Process /usr/lib/libreoffice/program/soffice.bin (pid=12130) exited with 77.
Error: Unable to connect or start own listener. Aborting.

MartinPaulEve commented 10 years ago

Hi Mayu,

Sorry for delayed reply - at an event all day.

I'm not sure what to recommend here. This seems to be an environment problem and it's hard to debug via Q&A....

Sorry to not have a better answer.

Best wishes,

Martin

may- commented 10 years ago

Hi,

Dulip said he will check it. Your suggestions helped me a lot. Thanks anyway!

nwp90 commented 8 years ago

Assuming your webserver runs as www-data, I suspect you need to create ~www-data/.config (on Debian this will be /var/www/.config) and make it writable by www-data:

sudo mkdir ~www-data/.config
sudo chown www-data ~www-data/.config
sudo chmod 700 ~www-data/.config

as libreoffice needs to be able to write config files here (or in one of several other places it tries, but this is its favourite).
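
A quick way to check this from Python, as a sketch only: the helper name config_dir_writable is made up for illustration, and it assumes the relevant directory is .config under the user's home directory, as described above.

```python
import os


def config_dir_writable(home):
    """Return True if <home>/.config exists and the current user may
    create files in it (write and search permission)."""
    config_dir = os.path.join(home, ".config")
    return os.path.isdir(config_dir) and os.access(config_dir, os.W_OK | os.X_OK)
```

Run it as the web server user, for example config_dir_writable("/var/www") on Debian. A result of False would be consistent with the listener failing to start.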

withanage commented 8 years ago

thanks a lot for the advice @nwp90

mohammedyunus009 commented 5 years ago

Try this script too


import glob
import subprocess
import threading

files = glob.glob("./res/*")

def convert(path):
    # Convert one file to HTML with a headless LibreOffice process.
    print(subprocess.call(['soffice', '--invisible', '--headless', '--convert-to', 'html', path]))
    # Alternative using unoconv:
    # print(subprocess.call(["unoconv", "-d", "document", "--format=html", path]))

# Start one thread per file, so the conversions run concurrently.
for path in files:
    t1 = threading.Thread(target=convert, args=(path,))
    t1.start()

Even this gives the same error. The underlying problem is that LibreOffice cannot run several instances at once; in short, you cannot convert documents from multiple threads. Solution: the caller should ensure that no other LibreOffice instance is open. As referred to in Issue
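
One way to follow that advice while keeping the threads, sketched under the assumption that this single Python process is the only thing launching LibreOffice on the machine: hold a lock around each conversion so that at most one soffice instance runs at a time. The names convert_serialized, convert_all and _office_lock are illustrative; the soffice flags are the ones from the script above.

```python
import subprocess
import threading

# Held for the duration of one conversion, so only one soffice runs at a time.
_office_lock = threading.Lock()


def convert_serialized(path, run=subprocess.call):
    """Convert one file to HTML, waiting until no other conversion is running.

    `run` is injectable only so the locking can be exercised without LibreOffice.
    """
    cmd = ['soffice', '--invisible', '--headless', '--convert-to', 'html', path]
    with _office_lock:
        return run(cmd)


def convert_all(paths, run=subprocess.call):
    """Start one thread per file; the lock makes the conversions run one by one."""
    threads = [threading.Thread(target=convert_serialized, args=(p, run)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

The lock only covers threads inside one process. If several processes (for example several web server workers) convert files, something shared between them, such as a lock file, would be needed instead.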