opensemanticsearch / open-semantic-search

Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
https://opensemanticsearch.org
GNU General Public License v3.0

Open Semantic Search Server - Bad File URL #118

Open blion-tec opened 6 years ago

blion-tec commented 6 years ago

Hello! I've just indexed a bunch of files and realized that they are all indexed as in the Desktop version, with file://localhost/ type URLs, so they aren't accessible in any way. Is there a way to correct the references so the files are accessible from outside the box, from a web browser on any machine? Example: 34527408_10156367071797246_7736711648995442688_n

YoannMR commented 6 years ago

Hi,

Not sure if this is what you need, but see below my notes on how to open files on a remote server (an EC2 instance) from a web browser:

How to enable opening files on EC2 instance from a web browser:

Also, one puzzling thing about setting the IP address in the config file: I set it up once on an old instance. I now use the same file, with the same old IP address, on a different instance with a different IP, and it still works (i.e., I can open a PDF file in the web browser).

Disclaimer: I'm just a random user of OSS, so don't take my advice for anything more than what I found out through trial and error.

Hope that helps! Yoann

rafael844 commented 5 years ago

I did everything as you said, but it's not working. I'm using OSS Desktop. I shared the folder sged in the VM, created the link with ln -s /home/user/Documents/sf_sged/ and ran chmod 777 /home/user/Documents/sf_sged/

In /etc/opensemanticsearch/connector-files: config['mappings'] = { "/var/www/html/": "http://10.200.52.140/" }

In apache2.conf: Listen 10.200.52.140:80

But it doesn't work. It indexes, and I can see the preview and Tagging & annotation fine, but when I try to open the file it returns: Not Found The requested URL /home/user/Documents/sf_sged/83/id_127_File.pdf was not found on this server.

What could be wrong?
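
One possible explanation, offered as an assumption since nothing in this thread confirms how the mapping is applied: if config['mappings'] is a plain prefix replacement on the path under which a file was indexed, then a file indexed as /home/user/Documents/sf_sged/... never matches the key /var/www/html/, so its link is not rewritten. The Python sketch below illustrates that check. `matching_prefix` is a hypothetical helper written for this illustration, not a function from Open Semantic Search.

```python
def matching_prefix(path, mappings):
    """Return the longest mapping key that `path` starts with, or None.

    Hypothetical helper for illustration only; it assumes the mapping
    works as a plain string-prefix match on the indexed path.
    """
    candidates = [prefix for prefix in mappings if path.startswith(prefix)]
    return max(candidates, key=len) if candidates else None


# Values taken from the comment above.
mappings = {"/var/www/html/": "http://10.200.52.140/"}
indexed_path = "/home/user/Documents/sf_sged/83/id_127_File.pdf"

# The indexed path does not start with "/var/www/html/", so nothing matches.
print(matching_prefix(indexed_path, mappings))  # None
```

If that assumption holds, either indexing the files through the path under /var/www/html/ or using a mapping key that actually prefixes the indexed path might be worth trying.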

Olydia commented 5 years ago

I'm having the same issue. I did everything, but I still can't open a file on the OSS server. Any idea how to fix the problem? Thank you! :)

barrydegraaff commented 3 years ago

Yeah, this is definitely annoying, as changing the config['mappings'] setting requires a re-index, and that takes days. As a new user installing the opensemanticsearch server package, I was not aware that this used to be a desktop application. Putting file:// links in a web application is IMHO not that useful, as that makes it a single-user/single-computer web application.

It would be nice if the path override could be configured to just change the links in the web UI, so re-indexing would not be necessary. In the meantime, I created a browser extension with some JavaScript that rewrites my opensemanticsearch URLs.

barrydegraaff commented 3 years ago

In the end all I needed was: config['mappings'] = { "/": "https://mydomain.com/" }

And I purged my entire installation and reindexed.
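
To illustrate what a mapping like this would do to an indexed path, here is a minimal Python sketch. It assumes the mapping is a longest-prefix replacement, which this thread does not confirm. `rewrite` is a hypothetical function and /home/user/Documents/report.pdf is a made-up example path; neither comes from Open Semantic Search.

```python
from urllib.parse import quote


def rewrite(path, mappings):
    """Replace the longest matching path prefix with its mapped URL base.

    Hypothetical illustration; assumes plain prefix replacement.
    Returns the path unchanged when no mapping key matches.
    """
    best = None
    for prefix in mappings:
        if path.startswith(prefix) and (best is None or len(prefix) > len(best)):
            best = prefix
    if best is None:
        return path
    # Percent-encode the remainder (slashes are kept) so spaces and
    # similar characters survive in a URL.
    return mappings[best] + quote(path[len(best):])


config = {}
config['mappings'] = {"/": "https://mydomain.com/"}

print(rewrite("/home/user/Documents/report.pdf", config['mappings']))
# https://mydomain.com/home/user/Documents/report.pdf
```

Because "/" prefixes every absolute path, under this assumption every indexed file would get an https:// link, and the web server would then have to serve the files under the same path layout.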

Since some traces are left behind after purging, I also had a background task to make sure no ETL process could start before I could copy a pre-configured config file into place, basically cp'ing /root/connector-files /etc/opensemanticsearch/connector-files all the time.

ufukayyildiz commented 2 years ago

I did everything as you said, but it's not working. I'm using OSS Desktop. I shared the folder sged in the VM, created the link with ln -s /home/user/Documents/sf_sged/ and ran chmod 777 /home/user/Documents/sf_sged/

In /etc/opensemanticsearch/connector-files: config['mappings'] = { "/var/www/html/": "http://10.200.52.140/" }

In apache2.conf: Listen 10.200.52.140:80

But it doesn't work. It indexes, and I can see the preview and Tagging & annotation fine, but when I try to open the file it returns: Not Found The requested URL /home/user/Documents/sf_sged/83/id_127_File.pdf was not found on this server.

What could be wrong?

worked for me. thanks