cars10 closed this issue 3 years ago
I can't get oft and accsdb to work by name with 0.3.3; maybe they only work by content matching. I'll work on patching the others into the current types db in the meantime. If you could edit this test case to reproduce the type matching you describe in 0.3.3, that would be helpful. Thanks!
# frozen_string_literal: true

require "bundler/inline"

gemfile(true) do
  source "https://rubygems.org"
  git_source(:github) { |repo| "https://github.com/#{repo}.git" }

  # gem "marcel", github: "rails/marcel", branch: "main"
  gem "marcel", "0.3.3"
  gem "minitest"
end

require "minitest"
require "minitest/autorun"
require "marcel"

class BugTest < Minitest::Test
  { oft: "application/octet-stream", accesdb: "application/octet-stream", mdb: "application/vnd.ms-access", mht: "application/x-mimearchive" }.each do |ext, type|
    define_method("test_#{ext}") do
      assert_equal type, Marcel::MimeType.for(name: "file.#{ext}")
    end
  end
end
office_files.zip

Hi, some errors on my side: the extension is accdb, not accsdb, and this was reported as application/octet-stream in marcel 0.3.3. Not great, but better than font/ttf.

.oft was reported before as application/x-ole-storage, which is what we currently use for our whitelist. But we can change that, of course, if you want to report it as application/vnd.ms-outlook, like the link that I found in the opener suggests.

The following code works with the attached files:
# frozen_string_literal: true

require 'bundler/inline'

gemfile(true) do
  source 'https://rubygems.org'
  git_source(:github) { |repo| "https://github.com/#{repo}.git" }

  # gem "marcel", github: "rails/marcel", branch: "main"
  gem 'marcel', '0.3.3'
  gem 'minitest'
end

require 'minitest'
require 'minitest/autorun'
require 'marcel'

class BugTest < Minitest::Test
  {
    oft: 'application/x-ole-storage',
    accdb: 'application/octet-stream',
    mdb: 'application/vnd.ms-access',
    mde: 'application/vnd.ms-access',
    mht: 'message/rfc822'
  }.each do |ext, type|
    define_method("test_#{ext}") do
      assert_equal type, Marcel::MimeType.for(Pathname.new("files/#{ext}.#{ext}"))
    end
  end
end
It looks like the TTF identification bug is being caused by a manual mime extension: https://github.com/rails/marcel/blob/a525d5b38f287ca0511c8eb26e657a1d46686e5f/lib/marcel/mime_type/definitions.rb#L40
We either need to define magic for accdb/mdb files or adjust the magic matcher for TTFs. Preferably, both access DB types should be able to identify as application/vnd.ms-access.

I can't find a good magic matcher for oft files. Microsoft's specs for file naming only mention Word, PowerPoint, and Excel.

If it isn't feasible to match to a more specific type for outlook files, we may want to look at falling back to application/x-ole-storage in cases where we would return the new tika type.
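
To illustrate why Access databases can end up identified as font/ttf, here is a minimal, self-contained sketch of prefix-based magic matching. It does not use Marcel's API or its real signature data: the SIGNATURES table and the sniff helper are hypothetical names made up for this example. It assumes that .mdb and .accdb files begin with the same four bytes as a TrueType font (00 01 00 00), followed by "Standard Jet DB" or "Standard ACE DB", so a matcher that only checks the four-byte prefix will claim them unless a longer signature is tried first.

```ruby
# frozen_string_literal: true

# Hypothetical signature table for illustration only (not Marcel's data).
# Entries are [offset, magic bytes, mime type], ordered most specific first.
ACCESS_TYPE = "application/vnd.ms-access"

SIGNATURES = [
  [0, "\x00\x01\x00\x00Standard Jet DB".b, ACCESS_TYPE], # .mdb (Jet)
  [0, "\x00\x01\x00\x00Standard ACE DB".b, ACCESS_TYPE], # .accdb (ACE)
  [0, "\x00\x01\x00\x00".b, "font/ttf"]                  # TrueType version tag
].freeze

# Returns the type of the first signature whose bytes appear at its offset,
# or application/octet-stream when nothing matches.
def sniff(header, signatures = SIGNATURES)
  bytes = header.b
  match = signatures.find do |offset, magic, _type|
    bytes.byteslice(offset, magic.bytesize) == magic
  end
  match ? match.last : "application/octet-stream"
end

# With only the short TTF prefix available, an Access header is misidentified.
ttf_only = SIGNATURES.last(1)
puts sniff("\x00\x01\x00\x00Standard ACE DB\x00", ttf_only)

# With the longer signatures checked first, it resolves to the Access type.
puts sniff("\x00\x01\x00\x00Standard ACE DB\x00")
```

Either option mentioned above fits this picture: adding longer signatures for the Access formats ahead of the font entry, or tightening the font matcher so it needs more than the four-byte prefix.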
According to this the mimetype of .oft (outlook email template) should be application/vnd.ms-outlook, but marcel currently reports application/x-tika-msoffice.

Some other issues include:
- .accdb, .mdb and .mde are reported as font/ttf
- .mht is reported as multipart/related

All of these files worked on marcel 0.3.3