jellybob / mimemagic

Mime type detection in ruby via file extension or file content
https://github.com/minad/mimemagic
MIT License

Support loading data at runtime, or by configuring a different location for a preinstalled version #1

Closed jellybob closed 3 years ago

jellybob commented 3 years ago

No license lawyering please - unless you're someone able to speak on behalf of freedesktop.org your interpretation of the GPL isn't going to add anything to this conversation

See https://github.com/rails/rails/issues/41750 and https://github.com/minad/mimemagic/issues/97 for background.

To minimize the impact on existing users of the mimemagic gem, particularly people using Rails, I'm going to have it load MIME types from a preinstalled copy of the Freedesktop MIME types database, rather than bundling one with the gem. This will require a copy of that database, either installed with your distribution or obtained in some other way. Its availability will be checked at build time.

minad commented 3 years ago

@wwahammy

> I'm only worried about not making this brittle. I want to be able to install my bundle and never have to worry about whether some external library is available. My project doesn't have a licensing problem; we don't want an unnecessary burden.

This is perfectly understandable. I simply wonder if it is possible to have two gems implementing the same API (mimemagic and mimemagic-gpl) and exchanging them seamlessly in your stack?

pboling commented 3 years ago

@jellybob @minad @zRedShift

My proposal would be to move the library under the umbrella of some organization, either the Rails organization or a new mimemagic.rb organization maintaining this package.

The golang mimemagic had the same license issue this morning. Perhaps both projects would be willing to join a new MimeMagic org? There are other similar cross-language orgs, such as oauth-xx, which has Python, PHP, and Ruby libraries for OAuth.

pboling commented 3 years ago

@jellybob Re: pulling from the internet

You could leave the user with a gem post-install-hook message, as many other gems have done in the past. It could evaluate the presence, or not, of the source file, and suggest three solutions if it is missing:

  1. Install the source file at the correct location
  2. Set the environment variable to indicate a different location
  3. Run a command, provided by the gem, that downloads the file to a location which the environment variable then points to
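
The three suggestions above could be generated by a small check like the one below. This is a minimal sketch, not the gem's actual behaviour: the environment variable name, the default path, and the `missing_database_message` helper are assumptions made for illustration. As far as I know, a gemspec's `post_install_message` is a static string, so the check itself would have to run from some other install-time hook.

```ruby
# Illustrative only: the variable name and default path are assumptions.
MIME_DB_ENV = "FREEDESKTOP_MIME_TYPES_PATH"
MIME_DB_DEFAULT = "/usr/share/mime/packages/freedesktop.org.xml"

# Returns nil when the database file is present. Otherwise returns a message
# listing the three remedies suggested in the comment above.
# `env` and `exists` are injectable so the logic can be exercised without
# touching the real environment or filesystem.
def missing_database_message(env: ENV, exists: File.method(:file?))
  path = env.fetch(MIME_DB_ENV, MIME_DB_DEFAULT)
  return nil if exists.call(path)

  <<~MSG
    mimemagic could not find the MIME types database at #{path}.
    To fix this, do one of the following:
      1. Install the file at #{MIME_DB_DEFAULT}
      2. Set #{MIME_DB_ENV} to the location of an existing copy
      3. Fetch a copy with a download command (not shown here), then set #{MIME_DB_ENV}
  MSG
end
```

Calling `missing_database_message` with no arguments uses the real `ENV` and `File.file?`; a caller would print the result only when it is not `nil`.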

ziggythehamster commented 3 years ago

Engineer for a large Rails app, with an information security background, here: please do not download packages at runtime by default. You know all those stories about abandoned Chrome extensions getting hijacked with things that steal information or run crypto miners? By downloading at runtime, you are creating a channel that allows this sort of thing. There was also a recent example of this happening with Node.js.

libxml2, and thus Nokogiri, very frequently have bugs allowing specially crafted XML files to do naughty things (this is part of the reason why Nokogiri vendors the latest version of libxml2). Doing this - by default - is bad for that reason, but it's also bad for anyone whose application deploy process would start tens of thousands of Ruby processes. If someone wasn't aware of this change and just updated to get their app working, we're talking about creating a DDoS against FDO. Applications fail to start, or if you ignore the error, they fail to work correctly once started. None of this is good, and ignoring the error potentially leaves users with vulnerabilities that only appear when the database couldn't be downloaded. Never mind that many apps have restricted egress and wouldn't be able to actually reach FDO to download the file.

My suggestion: use extconf.rb to check the system for the XML file. Treat it like it's a .so you need to link to. As with most gems needing native libraries, allow specifying where to find the library via environment variables. If it's not found, fail to "compile" just like any other Ruby library that needs a shared library would (though probably provide instructions on where to get the file). Load the XML file on boot and build out the Ruby class hierarchy or whatever you need to do.
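
The lookup described above might look roughly like this. It is a sketch under stated assumptions, not the gem's implementation: the environment variable name, the second candidate path, and both helper names are hypothetical. Only `/usr/share/mime/packages/freedesktop.org.xml` and the `shared-mime-info` package name come from this thread.

```ruby
# Sketch of the check an extconf.rb could perform. Names are illustrative.
CANDIDATE_PATHS = [
  "/usr/share/mime/packages/freedesktop.org.xml",
  "/usr/local/share/mime/packages/freedesktop.org.xml", # assumed fallback
].freeze

# Returns the first usable path, or nil. An explicit override in the
# environment replaces the default candidates entirely.
def locate_mime_database(env: ENV, candidates: CANDIDATE_PATHS, exists: File.method(:file?))
  override = env["FREEDESKTOP_MIME_TYPES_PATH"].to_s
  paths = override.empty? ? candidates : [override]
  paths.find { |path| exists.call(path) }
end

# Fails the "build" with instructions when no database is found, in the same
# spirit as a native extension that cannot find its shared library.
def require_mime_database!(**options)
  path = locate_mime_database(**options)
  return path if path

  raise "Could not find freedesktop.org.xml. Install your system's " \
        "shared-mime-info package or set FREEDESKTOP_MIME_TYPES_PATH."
end
```

An extconf.rb would call `require_mime_database!` once and let the exception abort the install. The `exists:` parameter exists only so the logic can be tested without a real file.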

I agree with others that building a derivative work on install (e.g., via extconf.rb) isn't a clear-cut solution absent some sort of mechanism like the Linux kernel has for license tainting. What I would probably do is make this integrate with Bootsnap somehow, so that if you're using Bootsnap, it only needs to parse the .xml file when you have Bootsnap cache the application's classes and whatnot. I'm not sure the Bootsnap cache is a derivative work, but as the application functions without it and users are already needing to come to grips with this possibility with other libraries, I wouldn't worry about it one way or the other.

jellybob commented 3 years ago

I'd come to much the same conclusion as @ziggythehamster while away from the computer. I'd much rather have build time failures due to the XML file not being available at gem install than run time failures at some random point in the future because a server was unavailable. I'm not hugely familiar with extconf.rb, but I'm going to dig into that and see if I can persuade it to do what's needed.

ziggythehamster commented 3 years ago

Yeah, I'm not either, but I know that's how gems with native code compile that native code. But it's a Ruby file that executes in the install process and it would be ideal to use that as an opportunity to check for the package being available without abusing one of the other hooks in RubyGems.

jellybob commented 3 years ago

The PR has been updated, and now explicitly only supports making use of a pre-existing copy of the file. If the environment variable is set at build time, the build looks in that location and then persists it for use at runtime in the form of a C extension (because the way extconf.rb works requires one to be built as far as I can tell... this was quite the rabbit hole).
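
For illustration, here is one way a build-time path could be persisted for runtime without compiling anything: generate a small Ruby file holding the path. This is not what the PR does (it uses a C extension, as described above), and the module name, constant name, and helper names are all hypothetical.

```ruby
# Illustrative alternative to a C extension: render the resolved path into a
# generated Ruby source file. All names here are hypothetical.
def render_path_file(database_path)
  <<~RUBY
    # Generated at install time; do not edit.
    module MimeMagicBuildConfig
      DATABASE_PATH = #{database_path.inspect}
    end
  RUBY
end

# Writes the generated source to `target`, which the library would then
# require at load time to learn where the database lives.
def write_path_file(database_path, target)
  File.write(target, render_path_file(database_path))
end
```

Using `inspect` keeps the generated string literal valid even if the path contains quotes or backslashes.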

@minad for the sake of unblocking the world's (and my own) Rails deployments, I'm going to step up and volunteer to take over maintenance of this gem until a better solution can be found. Either that, or can you (once some review has happened) push this version of the gem to Rubygems as 0.3.7, which should then make people's builds at least throw an error about needing the file, rather than just explode. Feel free to drop me an email on jon@blankpad.net if you want to talk about that somewhere less public.

jellybob commented 3 years ago

@fooishbar do you have a particular location you would like me to direct people to in order to obtain a copy of the compiled freedesktop.org.xml file for use with this gem if they don't already have one? The only freedesktop.org hosted version of that file I can find is freedesktop.org.xml.in as part of the source releases, and I have no idea what the implications of using that vs a built version would be.

ZanderBrown commented 3 years ago

At the risk of stating the obvious: a processed version of the file, in this case processed by gettext, is still GPL.

The unprocessed file of course lacks translations.

But hey, have a copy of my /usr/share/mime/packages/freedesktop.org.xml as shipped in fc33: freedesktop.org.xml

fooishbar commented 3 years ago

@jellybob I’m afraid we don’t have a canonical location for a post-processed file. To be honest I’m pretty glad we don’t, since we’re not behind a CDN or anything fancy ... there are a lot of Rails installs happening!

Perhaps you could serve the post-processed file from a GH repo alongside this one? AIUI the only difference is translations.

ziggythehamster commented 3 years ago

Without having looked at that other PR (I'm trying to land one of my own for an unrelated and hopefully more fun thing), my suggestion would be to say something like:

Install your operating system's shared-mime-info package, or fetch the XML file from the Debian package:

  1. Visit https://packages.debian.org/sid/amd64/shared-mime-info/download and download the .deb file.
  2. Install the command-line version of 7-Zip for your platform (sometimes called p7zip)
  3. Run this command: 7z x -so shared-mime-info_2.0-1_amd64.deb data.tar | 7z e -sidata.tar './usr/share/mime/packages/freedesktop.org.xml'

I would suggest to use ar + tar but ar cannot output to stdout (what) and tar needs --strip-components=4 to not create a hierarchy in the current directory.

ziggythehamster commented 3 years ago

In fact, if you use double quotes for the path, it runs on Windows verbatim:

M:\>7z x -so shared-mime-info_2.0-1_amd64.deb data.tar | 7z e -sidata.tar "./usr/share/mime/packages/freedesktop.org.xml"

7-Zip 19.00 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2019-02-21

Extracting archive: data.tar
--
Path = data.tar
Type = tar
Code Page = UTF-8

Everything is Ok

Folders: 166
Files: 92
Size:       4773833
Compressed: 140800

M:\>dir *.xml
 Volume in drive M is rpool_HOME
 Volume Serial Number is BA16-F57C

 Directory of M:\

10/09/2020  10:26         2,341,534 freedesktop.org.xml
               1 File(s)      2,341,534 bytes
               0 Dir(s)   8,020,033,536 bytes free

So change the above to use double quotes and then the instructions work for every major OS.

jellybob commented 3 years ago

@ziggythehamster this is great, thank you, I've added instructions based on your exploration to the readme.

jellybob commented 3 years ago

#3 has now been merged into master here; I'm currently talking with @minad about getting it released.

pezholio commented 3 years ago

Amazing work. Thanks for all your work on this @jellybob and others. I'm sure I speak for all of the community when I say I really appreciate it! 👍

Deradon commented 3 years ago

As #3 has been merged already, I'll copy my comment over here:

If this is released as a 0.3.x version, could it break on Windows machines when you're creating a new Rails project? Rails still has a soft dependency on 0.3.x. So I'd personally like to at least bump the minor version so the Rails maintainers can explicitly decide to go with the approach here. (I'd like to avoid another left-pad situation where suddenly the Rails install fails for a lot of projects.)

jellybob commented 3 years ago

Also copying my reply from that thread :)

Yes, it will potentially break creating a new Rails project on Windows, but that feels like a better option than quietly imposing GPL 2 licensing on every new Rails project which is the current situation. We're also looking at yanking 0.3.6 because of those licensing implications, so in practice whatever happens here new Rails projects on Windows are going to be broken to some degree.

Ultimately I don't think there's any way to avoid some degree of pain for users while still complying with license terms, and I consider abiding by those terms to take precedence for both legal and moral reasons.

jellybob commented 3 years ago

Mimemagic has been moved to its own org, so I'm also redirecting discussion of this to the repo over there, as shown above. Closing this one.

Deradon commented 3 years ago

@jellybob Just to let you know, can't comment yet on issues there.

An owner of this repository has limited the ability to comment to users that have contributed to this repository in the past.

Not sure if intended or not.

jellybob commented 3 years ago

@Deradon opening this one back up for now - they are indeed locked down temporarily while we get everything moved over and a release out to try and keep the noise down.

Deradon commented 3 years ago

I'd strongly suggest letting the Rails maintainers know when you release this version and what implications the release has. I'd like to avoid, or at least mitigate, another left-pad situation where suddenly bundle install does not work in a (new) Rails project, for at least some users, and issues pile up in the rails/rails repo.

jellybob commented 3 years ago

Yup, we will be doing that.

jellybob commented 3 years ago

Closing this issue now as 0.3.7 has been released.

ljharb commented 3 years ago

@jellybob just so it’s explicit, can you (and ideally, @hadess) confirm that to the best of your knowledge, v0.3.7 is properly MIT-licensed?

jellybob commented 3 years ago

This has been discussed with both @hadess and @fooishbar, and yes, to the best of all our knowledge, 0.3.7 is legitimately licensed as MIT since we no longer distribute any GPL-licensed data with this gem.

hadess commented 3 years ago

> @jellybob just so it’s explicit, can you (and ideally, @hadess) confirm that to the best of your knowledge, v0.3.7 is properly MIT-licensed?

Sorry, but I'm not going to do that. You should ask a lawyer.

I don't intend to file a DMCA takedown request against the repo at this point though ;)

ljharb commented 3 years ago

I indeed have done so :-) but thanks, that’s sufficient for a public answer here.

jellybob commented 3 years ago

@ljharb I'd be really curious to hear the outcome of that if you're happy to share it, either publicly or more privately.

fooishbar commented 3 years ago

Same from the fd.o side; my email is daniel@fooishbar.org and Bastien's is pretty easily findable as well. It would be really good to understand what you've gleaned from this. We've worked with SFC before and they've always been extremely sensible.

ipepe commented 3 years ago

@hadess I wanted to ask about installing dependencies: https://github.com/mimemagicrb/mimemagic#dependencies. As far as I understand, installing those in a production environment to use in my project forces my project's source code to be licensed under GPL? Basically shifting the licensing issue from the mimemagic gem to me as the author of my project?

rubyFeedback commented 3 years ago

> installing those in a production environment to use in my project forces my project's source code to be licensed under GPL

I heavily doubt that. Look at the Linux kernel, licensed under GPLv2, running in proprietary environments. IMO there are comments here on the issue tracker that cannot possibly be correct, but discussing that here would probably distract too much from the main issue at hand. It would be nice to read a post-mortem analysis at a later point, though, simply because I am pretty certain that other projects may be in a somewhat similar situation licence-wise.

pboling commented 3 years ago

@ipepe IANAL - A huge portion of the GNU stack is GPL licensed, and a large chunk of that is installed on millions of machines running corporate software.

The license cares about "distribution". You can install whatever GPL'd code you want on the machine and use the tools together. As long as this gem, and your project, are not "distributed" with GPL-licensed code inside them, then it is safe.

base10 commented 3 years ago

@jellybob Thank you for taking point on this and working through an acceptable solution with @hadess.

brettwgreen commented 3 years ago

Seems to me the previously bundled version of the Freedesktop database had some handling for docx files that is missing from the standalone install... I had to patch in some MIME types in an initializer of my Rails app to get the same behavior as before.

Used the init found here for what it's worth.

https://github.com/mimemagicrb/mimemagic/issues/39#issuecomment-323940672