jellybob / mimemagic

Mime type detection in ruby via file extension or file content
https://github.com/minad/mimemagic
MIT License
10 stars 6 forks source link

Externalise source data #3

Closed jellybob closed 3 years ago

jellybob commented 3 years ago

See #2 for full details of what's going on here.

To ensure compliance with the upstream license on the Freedesktop.org Shared Mime Types database this PR removes it from the gem, and instead requires the user to provide one themselves. Most Linux distributions probably have a copy of this already, and it can probably be found in the location being checked in ext/mimemagic/extconf.rb, so we default to that. Otherwise an environment variable will need to be set to point it in the right direction. This path also gets baked into the installed version of the gem via a Gemfile pretending to be a native extension.

DacotahHarvey commented 3 years ago

Consider adding TODO statements that reference an issue for the

Unknown if this test failure is expected. Commenting out for now

sections. It will greatly help the longevity of the project to have a reminder that there are outstanding questions for @minad

yorickpeterse commented 3 years ago

@jellybob IIRC you don't have to build an actual C extension. Years ago I wrote a bit about this here. Unless things have changed, you should be able to use a Rakefile instead. This way you probably don't need any actual C code. I hope this info is of any use.

pboling commented 3 years ago

As for Mac OS compatibility I am investigating the Homebrew path. The main prerequisite of shared-mime-info is glib2, which is actually just glib in Homebrew.

$ brew install glib
==> Downloading https://homebrew.bintray.com/bottles/libffi-3.3_3.big_sur.bottle.tar.gz
######################################################################## 100.0%
==> Downloading https://homebrew.bintray.com/bottles/sqlite-3.35.2.big_sur.bottle.tar.gz
==> Downloading from https://d29vzk4ow07wi7.cloudfront.net/db0fa1be882b6184dc9fa4ba8ca5aefbd9c6a1441b
######################################################################## 100.0%
==> Downloading https://homebrew.bintray.com/bottles/glib-2.68.0.big_sur.bottle.tar.gz
==> Downloading from https://d29vzk4ow07wi7.cloudfront.net/7d671e3104d1a3e8d620ef99b4a1c9b237362e60e8
######################################################################## 100.0%
... etc

The install for shared-mime-info itself is:

$ brew upgrade shared-mime-info
brew install shared-mime-info
==> Downloading https://homebrew.bintray.com/bottles/libffi-3.3_3.big_sur.bottle.tar.gz
Already downloaded: /Users/pboling/Library/Caches/Homebrew/downloads/60b45c0f23d19cde24cfc8e6834288901010f39f7733d9b3312e759a58229193--libffi-3.3_3.big_sur.bottle.tar.gz
==> Downloading https://homebrew.bintray.com/bottles/sqlite-3.35.2.big_sur.bottle.tar.gz
Already downloaded: /Users/pboling/Library/Caches/Homebrew/downloads/0ece5a897ef993b09cf8fdab929a7f2392b1057799cae7c89aaba8ca507e243a--sqlite-3.35.2.big_sur.bottle.tar.gz
==> Downloading https://homebrew.bintray.com/bottles/glib-2.68.0.big_sur.bottle.tar.gz
Already downloaded: /Users/pboling/Library/Caches/Homebrew/downloads/34a93ed75ca6c68aa1dc16c348870beef563edd005994b4c0ccb88b8e3628fa0--glib-2.68.0.big_sur.bottle.tar.gz
==> Downloading https://homebrew.bintray.com/bottles/gnu-getopt-2.36.2.big_sur.bottle.tar.gz
######################################################################## 100.0%
==> Downloading https://homebrew.bintray.com/bottles/shared-mime-info-2.1.big_sur.bottle.tar.gz
==> Downloading from https://d29vzk4ow07wi7.cloudfront.net/4857d9f38c0f3cbf23984d60c4ec6280d84b457123
######################################################################## 100.0%
... etc

The source file for a MacOS BigSur install of shared-mime-info-2.1 is usr/local/Cellar/shared-mime-info/2.1/share/shared-mime-info/packages/freedesktop.org.xml:

$ ls -la /usr/local/Cellar/shared-mime-info/2.1/share/shared-mime-info/packages/freedesktop.org.xml
-rw-r--r--  1 pboling  staff  2375737 Mar 24 16:18 /usr/local/Cellar/shared-mime-info/2.1/share/shared-mime-info/packages/freedesktop.org.xml
pboling commented 3 years ago

@jellybob I'd like to be able to configure the ENV variable via bundle config, which is what we do for nokogiri, mysql2 and many other gems.

bundle config set --global build.mimemagic --with-freedesktop-mime-types-path=/usr/local/Cellar/shared-mime-info/2.1/share/shared-mime-info/packages/freedesktop.org.xml

Which would result in this:

$ bundle config

Settings are listed in order of priority. The top value will be used.
build.libxml-ruby
Set for the current user (/Users/pboling/.bundle/config): "--with-xml2-config=/usr/local/opt/libxml2/bin/xml2-config --with-xml2-dir=/usr/local/opt/libxml2 --with-xml2-lib=/usr/local/opt/libxml2/lib --with-xml2-include=/usr/local/opt/libxml2/include"

build.mysql2
Set for the current user (/Users/pboling/.bundle/config): "--with-opt-dir=/usr/local/opt/openssl@1.1"

gem.test
Set for the current user (/Users/pboling/.bundle/config): "rspec"

gem.mit
Set for the current user (/Users/pboling/.bundle/config): true

build.mimemagic
Set for the current user (/Users/pboling/.bundle/config): "--with-freedesktop-mime-types-path=/usr/local/Cellar/shared-mime-info/2.1/share/shared-mime-info/packages/freedesktop.org.xml"

See the last lines! Then we just need to utilize the bundle config from within the gem...

this bit of the config lets me build nokogiri:

build.libxml-ruby
Set for the current user (/Users/pboling/.bundle/config): "--with-xml2-config=/usr/local/opt/libxml2/bin/xml2-config --with-xml2-dir=/usr/local/opt/libxml2 --with-xml2-lib=/usr/local/opt/libxml2/lib --with-xml2-include=/usr/local/opt/libxml2/include"

this bit of the config lets me build mysql2:

build.mysql2
Set for the current user (/Users/pboling/.bundle/config): "--with-opt-dir=/usr/local/opt/openssl@1.1"

My understanding is that the --with* bits get passed on to the gems when they are being built, such that it is equivalent to, e.g.:

gem install nokogiri --with-xml2-config=/usr/local/opt/libxml2/bin/xml2-config --with-xml2-dir=/usr/local/opt/libxml2 --with-xml2-lib=/usr/local/opt/libxml2/lib --with-xml2-include=/usr/local/opt/libxml2/include

And we'd just need it to work the same way here.

elebow commented 3 years ago

Here's a branch demonstrating how to use extconf.rb without actually building any C. You can cherry-pick this and build on it if you find it useful.

https://github.com/jellybob/mimemagic/compare/externalise-source-data...elebow:externalise-source-data-no-c

This also makes some other changes:

  1. Download the file at install-time, but decide which file to open at runtime. This lets the user supply ENV["FREEDESKTOP_MIME_TYPES_PATH"] after the gem has been installed, or change locations without reinstalling the gem.

  2. Download the file only if the user requests it with --with-download-freedesktop-database.

  3. Refactor the way we check possible paths for maintainability—we can add more possible paths to the array in MimeMagic.database_path.

jellybob commented 3 years ago

@jellybob IIRC you don't have to build an actual C extension. Years ago I wrote a bit about this here. Unless things have changed, you should be able to use a Rakefile instead. This way you probably don't need any actual C code. I hope this info is of any use.

🎉 Thanks for this - just about to push a version that does this without a C extension now.

jellybob commented 3 years ago

@elebow thanks for that - I'd actually implemented the version without an extension before seeing your comment, so didn't make use of that. I'd definitely be interested in a PR to support downloading the file if a version that can be used in that manner can be found though.

jellybob commented 3 years ago

@pboling a PR to support this would be welcome, but at least for now I think I'm going to go with the current implementation for the sake of getting something shipped, and build unblocked, even if that is in an initially suboptimal way.

mvz commented 3 years ago

How about rebasing this on the 0.3.5 version to avoid

I tried this and there were some merge conflicts but nothing too serious. See this branch: https://github.com/Docbldr/mimemagic/tree/externalise-source-data-rebased

jellybob commented 3 years ago

@mvz I'm not overly concerned about rebasing from 0.3.5 in terms of licensing - we've been talking to the copyright holder who raised the initial issue, and there's agreement there that the approach being taken resolves their complaint, and no issues with reverting the license to MIT. On the other point around mimemagic/overlay, I'm honestly not familiar enough with this code to know what the implications are. I'm going to merge the changes I've made into master on this repository as I'm reasonably happy with them, so feel free to open a PR to reintroduce that.

Deradon commented 3 years ago

When releasing this as a 0.3.x version will this possibly break on windows machines if you're creating a new rails project? Rails still has a soft-dependency on 0.3.x. So I'ld personally like to at least bump the minor version so the rails maintainers can explicitly decide to go w/ the approach here.

jellybob commented 3 years ago

Yes, it will potentially break creating a new Rails project on Windows, but that feels like a better option than quietly imposing GPL 2 licensing on every new Rails project which is the current situation. We're also looking at yanking 0.3.6 because of those licensing implications, so in practice whatever happens here new Rails projects on Windows are going to be broken to some degree.

mvz commented 3 years ago

@jellybob Thanks for all your work in resolving this! I've opened a pull request to reintroduce overlay.rb to improve compatibility with older 0.3.x versions.

pboling commented 3 years ago

I agree with yanking 0.3.6. I am sure that many people updated to it to make things work without understanding or noticing the licensing changes. I have personally seen MIT-licensed open source projects that did this. For their benefit and safety, as well as that of the closed source apps we can't see, we should yank 0.3.6.

enriclluelles commented 3 years ago

@jellybob Thanks so much for your hard work on this and ensuring that the gem can be used with the file loaded at runtime and no behavioral changes!

Much appreciated

pboling commented 3 years ago

I would probably even yank 0.4.0, TBH. I wouldn't expect a drastic license change in anything but a major version or a name change, and would prefer a name change. The GPL version should be renamed to mimemagic-gpl 1.0.0

jellybob commented 3 years ago

Yup - 0.3.6 and 0.4.0 have both been yanked.