elwerene / libreoffice-convert

MIT License
241 stars 94 forks

No exception thrown when converting an unconvertible file; how to set a timeout? #119

Closed toknT closed 2 weeks ago

toknT commented 1 month ago

In my case I need to convert a docx file to HTML and display it to the user.

One of my users uploaded a file that cannot be converted to HTML even directly on the server with the command libreoffice --convert-to html error.docx

ubuntu@VM-12-9-ubuntu ~/tmp> ls
error.docx  lu28697pn1w.tmp  lu28879q70o.tmp  lu623234iaq3.tmp  ok.docx
ubuntu@VM-12-9-ubuntu ~/tmp> libreoffice --convert-to html ok.docx
convert /home/ubuntu/tmp/ok.docx as a Writer document -> /home/ubuntu/tmp/ok.html using filter : HTML (StarWriter)
ubuntu@VM-12-9-ubuntu ~/tmp> libreoffice --convert-to html error.docx
convert /home/ubuntu/tmp/error.docx as a Writer document -> /home/ubuntu/tmp/error.html using filter : HTML (StarWriter)
# the server crashes and keeps converting at 100% CPU lol

I run this code on the server, but it does not throw any exception when processing the unconvertible file.

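const fs = require('fs');
const { promisify } = require('util');
const libre = require('libreoffice-convert');
// Promisify the callback API, as shown in the libreoffice-convert README.
libre.convertAsync = promisify(libre.convert);

// Logger is the application's own logger (not shown in this issue).
// The function name is taken from the log messages below.
async function convertWordFileToHtml(filePath) {
  try {
    const inputPath = filePath;
    const outputPath = filePath.replace('.docx', '.html');
    const docxBuf = fs.readFileSync(inputPath); // synchronous call, no await needed
    const htmlBuf = await libre.convertAsync(
      docxBuf,
      'html:HTML:EmbedImages',
      undefined,
    );
    fs.writeFileSync(outputPath, htmlBuf);
    fs.rmSync(filePath);
    return outputPath;
  } catch (error) {
    Logger.error('convertWordFileToHtml error : tmp file=>' + filePath);
    // JSON.stringify(error) prints '{}' for Error objects, so log the message.
    Logger.error('convertWordFileToHtml error :' + (error && error.message));
    return '';
  }
}

toknT commented 1 month ago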

Hey, this is a really urgent problem. Is there any update? Maybe LibreOffice has too many bugs when reading files uploaded by users that are written in Chinese.

elwerene commented 1 month ago

Maybe you could add a timeout and kill the process. You could implement it and send a merge request; I will review and publish it if it looks good to me :)
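
For readers landing on this issue, here is a minimal sketch of the timeout-and-kill idea, written against Node's built-in child_process.execFile rather than libreoffice-convert itself. The helper names runWithTimeout and convertDocxToHtml, the 30-second default, and the use of the libreoffice binary from the shell session above are all assumptions for illustration; none of this is part of the library's API.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper (not part of libreoffice-convert): run a command and
// reject if it fails or does not exit within timeoutMs. Node sends killSignal
// to the child once the timeout elapses.
function runWithTimeout(command, args, timeoutMs) {
  return new Promise((resolve, reject) => {
    execFile(
      command,
      args,
      { timeout: timeoutMs, killSignal: 'SIGKILL' },
      (error, stdout, stderr) => {
        if (error) {
          // error.killed is true when Node itself killed the child,
          // for example because the timeout elapsed.
          reject(error);
          return;
        }
        resolve({ stdout, stderr });
      },
    );
  });
}

// Assumed usage: the same CLI conversion as in the issue, bounded in time.
function convertDocxToHtml(inputPath, outDir, timeoutMs = 30000) {
  return runWithTimeout(
    'libreoffice',
    ['--headless', '--convert-to', 'html', '--outdir', outDir, inputPath],
    timeoutMs,
  );
}
```

With this shape, a hung conversion becomes a rejected promise, so a surrounding try/catch would see it. One caveat to verify on your own system: if the libreoffice launcher starts a separate worker process, killing the launcher may not stop that worker, and it may need to be cleaned up separately.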

toknT commented 2 weeks ago

@elwerene Sorry, there is already an npm package, libreoffice-file-converter, that has a timeout feature. I am an amateur Node.js/backend developer and don't have the skills to do a PR; my main job is mobile app development, so I would need to learn more to do that😅.

So I chose to use libreoffice-file-converter.