riboseinc / attr_masker

A fork of attr_encrypted, repurposed to handle data masking in test databases
MIT License
17 stars 8 forks source link

More masking options! #5

Open ribose-jeffreylau opened 7 years ago

ribose-jeffreylau commented 7 years ago
skalee commented 7 years ago

Currently masking logic is customized via two parameters: :masker and :mask_method. For example, when masking is specified as attr_masker :some_attr, masker: A, mask_method, :b, the method A.b(value: value_of_some_attr) will be called to get the masked value.

I would like to simplify it. I want to get rid of the :mask_method option, only the :masker option would be supported. Also, I'd rather call the #call method instead of #mask, so that Procs would be supported out of the box.

Moreover, I'd like to change the parameters send to that masker object. At the moment we're sending a hash of options, but I want to send an attribute value as the first argument. This way we a very simple procs like ->(value) { '*' * value.size } would be supported as well. The other things (self, attr_masker options, etc.) could be send as subsequent arguments as long as the masker method's arity allows that.

cc @ronaldtse @abunashir

ronaldtse commented 7 years ago

I agree with @skalee on the usage simplification, this still allows the usage of custom masking objects in the simplest form a Proc 👍

skalee commented 7 years ago

Also guys, do we support marshalling as in the original gem? This is totally okay to me, but I suppose this feature is useful than other ones. Perhaps it's better to remove it rather than supporting it. What do you think?

ronaldtse commented 7 years ago

I think we should still support marshaling but maybe in a Rails-native way? Or as long as there is a way to support masking marshaled data without needing marshall support within the gem, it is good enough.

skalee commented 7 years ago

I wonder what the :column_name option is for. Apparently it overrides the column name when it's different from the attribute name. However, I cannot think of a good reason to support it at the moment. It isn't present in the original gem, and has been added in commit 82cab5342abb6fb3b9f, so I bet it is used by one of your projects. Therefore, I ask you for some concrete example or green light on its removal.

cc: @ronaldtse @abunashir

ronaldtse commented 7 years ago

@skalee I suspect it was used to support attributes that uses different column names like attr_encrypted does, but it might no longer be used. @ribose-jeffreylau should be able to clarify this with authority since he wrote it.

skalee commented 7 years ago

The discussion on the :column_name option is continued in #22.

skalee commented 7 years ago

The "Proc as parameter" has been fixed with #23.

skalee commented 7 years ago

BTW options :key and :encode have been removed in 2e8f58618afc9a4ace39947dd35f600b5b683eee and 56a9ee092c1e22281c343906c575f990857d3c10, respectively (PR #21). I believe no one is missing these options.

ronaldtse commented 7 years ago

I believe you're right @skalee. Will continue on #22. Thanks!

skalee commented 7 years ago

So I wonder what new built-in maskers can be introduced.

  1. Ability to mask e-mails is an obvious requirement. However, that can be achieved easily with procs, so I doubt we need a dedicated masker. New e-mails can be even generated basing on existing ones, for instance:
->(value:) { "email+#{value.gsub("@", "_at_")}@example.com" }
  1. Ability to mask passwords is another obvious requirement. It should be possible to set a common password for all users.

  2. You've suggested a built-in masker for structured text, like HTML or Markdown. But I have a feeling that it may produce weird and odd-looking results. Maybe if I preserve the number of words (or even letters) in every HTML element… I don't know.

What else could we mask in a way which is common enough so that a built-in masker is advocated? What do you mean by scrambling algorithms, precisely? I guess replacing sensitive values with some legit-looking random ones is a better idea.

cc @ronaldtse

ronaldtse commented 7 years ago

I think we could bridge the classes of the fake data gems (faker / well_read_faker / ffaker) so you can just do:

attr_masker :email, masker: :email, unique: true
attr_masker :last_name, masker: AttrMasker::Maskers::Name
attr_masker :telephone, masker: :tel
attr_masker :html_content, masker: AttrMasker::Maskers::HtmlIliad
skalee commented 7 years ago

Right now you can do:

attr_masker :email, masker: proc { "#{SecureRandom.hex(10)}@example.com" }
attr_masker :last_name, masker: proc { FFaker::Name.last_name }
attr_masker :telephone, masker: proc { FFaker::PhoneNumber.short_phone_number }
attr_masker :html_content, masker: proc { FFaker::HTMLIpsum.fancy_string }

IMHO bridging faker etc. is nothing but a maintenance burden. I mean implementing built-in maskers make sense only when masked values are derived from the original ones, or random value generator isn't that trivial.

ronaldtse commented 7 years ago

That's true @skalee . Less burden is always better! 👍

ribose-jeffreylau commented 7 years ago

Regarding:

The motivation is to use the huge repository of the different combinations of HTML structures already available to us, so by just applying masks, our tests could already have a rich stash of test HTML to work on. This is useful for testing out stylesheets and wysiwyg editors, for instance.

But I believe this is not essential, as it can easily be written by the gem user in the Proc.

ribose-jeffreylau commented 6 years ago

It seems we'd also need something like the following:

attr_masker :stuff, masker: ->(model:) { model.species =~ /cat/ ? 'miao' : 'woof' }

i.e. the ability for the proc to retrieve the model being masked.

ronaldtse commented 6 years ago

Agree with @ribose-jeffreylau . @skalee would you have time to implement this?

ribose-jeffreylau commented 6 years ago

@ronaldtse @skalee Actually I'm already working on it right now :)