cadmiumcr / cadmium

Natural Language Processing (NLP) library for Crystal
https://cadmiumcr.com
MIT License
205 stars 15 forks source link

Binary dependent on hard coded data path #12

Closed rmarronnier closed 5 years ago

rmarronnier commented 5 years ago

Great job job on this fantastic lib !

I'm developing a web app deployed on Heroku and after the crystal binary is compiled all librairies / shards are discarded.

My app compiles fine with cadmium but crashes because it can't find files inside the data folder :

2019-07-08T20:22:33.238998+00:00 app[web.1]: Unhandled exception: Error opening file '/tmp/build_b996c138b6ca4a4c639a8e6ba2a58942/lib/cadmium/src/cadmium/../../data/sentiment.txt' with mode 'r': No such file or directory (Errno)
2019-07-08T20:22:33.239028+00:00 app[web.1]: from /tmp/crystal/share/crystal/src/crystal/system/file.cr:7:7 in 'new'
2019-07-08T20:22:33.239031+00:00 app[web.1]: from /tmp/crystal/share/crystal/src/file.cr:591:12 in 'read'
2019-07-08T20:22:33.239033+00:00 app[web.1]: from /tmp/crystal/share/crystal/src/path.cr:102:5 in '__crystal_main'
2019-07-08T20:22:33.239036+00:00 app[web.1]: from /tmp/crystal/share/crystal/src/crystal/main.cr:47:14 in 'main'
2019-07-08T20:22:33.239038+00:00 app[web.1]: from /tmp/crystal/share/crystal/src/hash.cr:63:3 in '__libc_start_main'
2019-07-08T20:22:33.239040+00:00 app[web.1]: from _start
2019-07-08T20:22:33.239042+00:00 app[web.1]: from ???

Should binaries compiled against Cadmium be dependent on Cadmium being present on the target computer ?

I think not, but there are several solutions to this problem :

1 - Put all data inside .cr files in arrays / hashes / tuples

2 - Make the path to data files configurable (via a yml config file or ENV variables)

3 - ??? I can't think of more :-)

If I missed something in the docs, feel free to close this issue ;-)

watzon commented 5 years ago

Ahh I know the problem. The problem isn't that the path is hardcoded so much as it is that the file isn't being loaded into the binary like I thought it was. Should be an easy fix, so I'll get back to you soon.

rmarronnier commented 5 years ago

Wow, that was fast ! Thanks :-D

watzon commented 5 years ago

I try! So currently files are being read in as Strings, I just need to make it so they get read in as StringLiterals using the read_file macro. I should have it fixed by this evening.

rmarronnier commented 5 years ago

Yeah, I saw the gitter chat :-) No stress mate. I'm heading for bed now. But I'll sure be telling this tale (in an article or to my grand children) of how crazy fast this bug was fixed :-p

watzon commented 5 years ago

Should be all good now. I also added the ability to set custom sentiment data just in case you want to use your own.

rmarronnier commented 5 years ago

I tried compiling again, and now it fails on compilation with the same error :

in lib/cadmium/src/cadmium/sentiment.cr:42: Error opening file './data/sentiment.txt' with mode 'r': No such file or directory

      @@data ||= {{ read_file("./data/sentiment.txt") }}

You can try a

crystal spec

It seems you made a typo in the data path :

@@data ||= {{ read_file("./data/sentiment.txt") }}

Instead of the old one :

@@data = File.read(File.join(__DIR__, "../../data/sentiment.txt"))
rmarronnier commented 5 years ago
@@data ||= {{ read_file("#{__DIR__}/../../data/sentiment.txt") }}

This works. Do you want me to do a PR ?

watzon commented 5 years ago

@rmarronnier yes please. I thought removing the __DIR__ would be fine, but I forgot that when you include Cadmium in another project your cwd changes.