RIPAGlobal / scimitar

A SCIM v2 API endpoint implementation
MIT License

Scimitar

A SCIM v2 API endpoint implementation for Ruby On Rails.

For a list of changes and information on major version upgrades, please see CHANGELOG.md.

Overview

System for Cross-domain Identity Management (SCIM) is a protocol that helps systems synchronise user data between different business systems. A service provider hosts a SCIM API endpoint implementation and the Scimitar gem is used to help quickly build this implementation. One or more enterprise subscribers use these APIs to let that service know about changes in the enterprise's user (employee) list.

In the context of the names used by the SCIM standard, the service that is provided is some kind of software-as-a-service solution that the enterprise subscriber uses to assist with their day to day business. The enterprise maintains its user (employee) list via whatever means it wants, but includes SCIM support so that any third party services it uses can be kept up to date with adds, removals or changes to employee data.

Installation

Install using:

gem install scimitar

In your Gemfile:

gem 'scimitar', '~> 2.0'

Scimitar uses semantic versioning so you can be confident that patch and minor version updates for features, bug fixes and/or security patches will not break your application.

Heritage

Scimitar borrows heavily - to the point of cut-and-paste - from:

All three are provided under the MIT license. Scimitar is too.

Usage

Scimitar is best used with Rails and ActiveRecord, but it can be used with other persistence back-ends too - you just have to do more of the work in controllers using Scimitar's lower level controller subclasses, rather than relying on Scimitar's higher level ActiveRecord abstractions.

Authentication

To set up an authentication method (see also the Security section later), create config/initializers/scimitar.rb in your Rails application and define a token-based authenticator and/or a username-password authenticator in the engine configuration section documented in the sample file. For example:

Scimitar.engine_configuration = Scimitar::EngineConfiguration.new({
  token_authenticator: Proc.new do | token, options |

    # This is where you'd write the code to validate `token` - the means by
    # which your application issues tokens to SCIM clients, or validates them,
    # is outside the scope of the gem; the required mechanisms vary by client.
    # More on this can be found in the 'Security' section later.
    #
    SomeLibraryModule.validate_access_token(token)

  end
})

When it comes to token access, Scimitar neither enforces nor presumes any kind of encoding for bearer tokens. You can use anything you like, including encoding/encrypting JWTs if you so wish - https://rubygems.org/gems/jwt may be useful. The way in which a client might integrate with your SCIM service varies by client and you will have to check documentation to see how a token gets conveyed to that client in the first place (e.g. a full OAuth flow with your application, or just a static token generated in some UI which an administrator copies and pastes into their client's SCIM configuration UI).
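
As a minimal sketch of the "static token" case, assuming your application keeps SHA-256 digests of the tokens it has issued and that a truthy return value signals a valid token, an authenticator block could compare digests in constant time. ISSUED_TOKEN_DIGESTS, constant_time_equal? and TOKEN_AUTHENTICATOR are hypothetical names for illustration and are not part of Scimitar; only the (token, options) block arguments come from the example above.

```ruby
require 'digest'

# Hypothetical stand-in for wherever your application stores issued tokens.
# Only digests are kept, so a leaked list does not reveal usable tokens.
ISSUED_TOKEN_DIGESTS = [Digest::SHA256.hexdigest('example-static-token')].freeze

# Compare two strings without returning early on the first differing byte.
def constant_time_equal?(a, b)
  return false unless a.bytesize == b.bytesize

  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# Same argument shape as the token_authenticator block shown earlier.
TOKEN_AUTHENTICATOR = Proc.new do |token, options|
  candidate = Digest::SHA256.hexdigest(token.to_s)
  ISSUED_TOKEN_DIGESTS.any? { |known| constant_time_equal?(known, candidate) }
end
```

A Proc like this could then be passed as the token_authenticator: value in the engine configuration, in place of the SomeLibraryModule call.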

Strongly recommended: You should wrap any Scimitar configuration with Rails.application.config.to_prepare do... so that any changes you make to configuration during local development are reflected via auto-reload, rather than requiring a server restart.

Rails.application.config.to_prepare do
  Scimitar.engine_configuration = Scimitar::EngineConfiguration.new({
    # ...
  })
end

In general, Scimitar's own development and tests assume this approach. If you choose to put the configuration directly into an initializer file without the to_prepare wrapper, you will be at a slightly higher risk of tripping over unrecognised Scimitar bugs; please make sure that your own application test coverage is reasonably comprehensive.

Routes

For each resource you support, add these lines to your routes.rb:

namespace :scim_v2 do
  mount Scimitar::Engine, at: '/'

  get    'Users',     to: 'users#index'
  get    'Users/:id', to: 'users#show'
  post   'Users',     to: 'users#create'
  put    'Users/:id', to: 'users#replace'
  patch  'Users/:id', to: 'users#update'
  delete 'Users/:id', to: 'users#destroy'
end

All routes then will be available at https://.../scim_v2/... via controllers you write in app/controllers/scim_v2/..., e.g. app/controllers/scim_v2/users_controller.rb. More on controllers later.

URL helpers

Internally, Scimitar always invokes URL helpers in the controller layer, so any variable path parameters will be resolved by Rails automatically. If you need more control over the way URLs are generated, you can override any URL helper by redefining it in the application controller mixin; see the application_controller_mixin engine configuration option.

Data models

Scimitar assumes that each SCIM resource maps to a single corresponding class in your system. This might be an abstraction over more complex underpinnings, but either way, a 1:1 relationship is expected. For example, a SCIM User might map to a User ActiveRecord model in your Rails application, while a SCIM Group might map to some custom class called Team which operates on a more complex set of data "under the hood".

Before writing any controllers, it's a good idea to examine the SCIM specification and figure out how you intend to map SCIM attributes in any resources of interest, to your local data. A mixin is provided which you can include in any plain old Ruby class (including, but not limited to ActiveRecord model classes) - a more readable form of the comments in this file is in the RDoc output.

The functionality exposed by the mixin is relatively complicated because the range of operations that the SCIM API supports is quite extensive. Rather than duplicate all the information here, please see the extensive comments in the mixin linked above for more information. There are examples in the test suite's Rails models, or for another example:

class User < ActiveRecord::Base

  # The attributes in the SCIM section below include a reference to this
  # hypothesised 'groups' HABTM relationship. All of the other "leaf node"
  # Symbols - e.g. ":first_name", ":last_name" - are expected to be defined as
  # accessors e.g. via ActiveRecord and your related database table columns,
  # "attr_accessor" declarations, or bespoke "def foo"/"def foo=(value)". If a
  # write accessor is not present, the attribute will not be writable via SCIM.
  #
  has_and_belongs_to_many :groups

  # ===========================================================================
  # SCIM MIXIN AND REQUIRED METHODS
  # ===========================================================================
  #
  # All class methods shown below are mandatory unless otherwise commented.

  def self.scim_resource_type
    return Scimitar::Resources::User
  end

  def self.scim_attributes_map
    return {
      id:         :id,
      externalId: :scim_uid,
      userName:   :username,
      name:       {
        givenName:  :first_name,
        familyName: :last_name
      },
      emails: [
        {
          match: 'type',
          with:  'work',
          using: {
            value:   :work_email_address,
            primary: true
          }
        },
        {
          match: 'type',
          with:  'home',
          using: {
            value:   :home_email_address,
            primary: false
          }
        },
      ],
      phoneNumbers: [
        {
          match: 'type',
          with:  'work',
          using: {
            value:   :work_phone_number,
            primary: false
          }
        },
      ],

      # NB The 'groups' collection in a SCIM User resource is read-only, so
      #    we provide no ":find_with" key for looking up records for writing
      #    updates to the associated collection.
      #
      groups: [
        {
          list:  :groups,
          using: {
            value:   :id,
            display: :display_name
          }
        }
      ],
      active: :is_active
    }
  end

  def self.scim_mutable_attributes
    return nil
  end

  # The attributes in this example include a reference to the same hypothesised
  # 'Group' model as in the HABTM relationship above. In this case, in order to
  # filter by "groups" or "groups.value", the 'column' entry must reference the
  # Group model's ID column as an AREL attribute as shown below, and the SCIM
  # controller's #storage_scope implementation must also introduce a #join with
  # ':groups' - see the "Queries & Optimisations" section below.
  #
  def self.scim_queryable_attributes
    return {
      givenName:        { column: :first_name },
      familyName:       { column: :last_name },
      emails:           { column: :work_email_address },
      groups:           { column: Group.arel_table[:id] },
      "groups.value" => { column: Group.arel_table[:id] },
    }
  end

  # Optional but recommended.
  #
  def self.scim_timestamps_map
    {
      created:      :created_at,
      lastModified: :updated_at
    }
  end

  # If you omit any mandatory declarations, you'll get an exception raised by
  # this inclusion which tells you which method(s) need(s) to be added.
  #
  include Scimitar::Resources::Mixin
end
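
To make the shape of the attribute map more concrete, here is a simplified, plain-Ruby sketch of how such a map could be walked to turn a record into a SCIM-style Hash. This is not Scimitar's implementation (the real mixin also handles "list:" entries, writes, PATCH operations and more); ExampleUser, EXAMPLE_MAP and sketch_to_scim are hypothetical names used only for illustration.

```ruby
# Hypothetical stand-in for a model with plain accessors.
ExampleUser = Struct.new(
  :id, :username, :first_name, :last_name, :work_email_address,
  keyword_init: true
)

# A cut-down map in the same style as scim_attributes_map above.
EXAMPLE_MAP = {
  id:       :id,
  userName: :username,
  name:     { givenName: :first_name, familyName: :last_name },
  emails:   [
    { match: 'type', with: 'work', using: { value: :work_email_address, primary: true } }
  ]
}.freeze

def sketch_to_scim(record, map)
  map.each_with_object({}) do |(scim_key, target), out|
    out[scim_key.to_s] =
      case target
      when Symbol then record.public_send(target)     # leaf: call the accessor
      when Hash   then sketch_to_scim(record, target) # nested complex attribute
      when Array                                      # match/with/using entries
        target.map do |entry|
          sketch_to_scim(record, entry[:using]).merge({ entry[:match] => entry[:with] })
        end
      else target                                     # literal, e.g. true/false
      end
  end
end
```

Calling sketch_to_scim(ExampleUser.new(...), EXAMPLE_MAP) yields nested "name" data and one "emails" entry tagged with "type" => "work", which mirrors how the match/with/using keys describe fixed array entries.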

Controllers

ActiveRecord

If you use ActiveRecord, your controllers can potentially be extremely simple by subclassing Scimitar::ActiveRecordBackedResourcesController - at a minimum:

module Scim
  class UsersController < Scimitar::ActiveRecordBackedResourcesController

    protected

      def storage_class
        User
      end

      def storage_scope
        User.all # Or e.g. "User.where(is_deleted: false)" - whatever base scope you require
      end

  end
end

All data-layer actions are taken via #find or #save!, with exceptions such as ActiveRecord::RecordNotFound, ActiveRecord::RecordInvalid or generalised SCIM exceptions handled by various superclasses. For a real Rails example of this, see the test suite's controllers which are invoked via its routing declarations.

Overriding controller methods

You can override the write-based controller methods #create, #update, #replace and #destroy in your controller subclass, should you wish, wherein a call to super is passed a block. The block is invoked with the instance of a new unsaved record for #create, the updated record that needs to have its changes saved for #update and #replace, and the record that should be destroyed for #destroy. This allows you to do things like applying business logic, default values, extra request-derived data and so forth before then calling record.save!, or using some method other than record.destroy! to discard a record (e.g. you might be using soft-delete, or want to skip all callbacks for some reason via record.delete).

Example:

module Scim
  class UsersController < Scimitar::ActiveRecordBackedResourcesController

    # Create all new records with some special internal field set to a value
    # determined by a bespoke-to-your-application mechanism.
    #
    def create
      super do | user |
        user.some_special_on_creation_field = method_that_calculates_value()
        user.save!
      end
    end

    # Use #discard! rather than #destroy! as an example of soft-delete via the
    # 'discard' gem - https://rubygems.org/gems/discard.
    #
    def destroy
      super do | user |
        user.discard!
      end
    end

  end
end
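
The "call super with a block" pattern itself is plain Ruby. The following self-contained sketch shows the mechanism in isolation, assuming a base class that yields the record to a block when one is given and otherwise persists it itself. SketchRecord, SketchBaseController and SketchUsersController are hypothetical names and do not reflect Scimitar's internals.

```ruby
SketchRecord = Struct.new(:name, :source, :saved)

class SketchBaseController
  # Builds a record, then either hands it to the caller's block (which
  # becomes responsible for "saving") or applies the default behaviour.
  def create(attrs, &block)
    record = SketchRecord.new(attrs[:name], nil, false)

    if block
      block.call(record)
    else
      record.saved = true # default persistence path
    end

    record
  end
end

class SketchUsersController < SketchBaseController
  def create(attrs)
    # Bare "super" forwards attrs; the block replaces the default save step.
    super do |record|
      record.source = 'scim'
      record.saved  = true
    end
  end
end
```

The subclass never needs to know how the record was built; it only decides what happens at the point of persistence.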

Queries & Optimisations

The scope can be optimised to eager load the data exposed by the SCIM interface, i.e.:

def storage_scope
  User.eager_load(:groups)
end

In cases where you have references to related columns in your scim_queryable_attributes, your storage_scope must join the relation:

def storage_scope
  User.left_joins(:groups)
end

Other source types

If you do not use ActiveRecord to store data, or if you have very esoteric read-write requirements, you can subclass Scimitar::ResourcesController in a manner similar to this:

class UsersController < Scimitar::ResourcesController

  # SCIM clients don't use Rails CSRF tokens.
  #
  skip_before_action :verify_authenticity_token

  # If you have any filters you need to run BEFORE authentication done in
  # the superclass (typically set up in config/initializers/scimitar.rb),
  # then use "prepend_before_action" to declare these - else Scimitar's
  # own authorisation before-action filter would always run first.

  def index
    # There's a degree of heavy lifting for arbitrary storage engines.
    query = if params[:filter].present?
      attribute_map = User.new.scim_queryable_attributes() # Note use of *instance* method
      parser        = Scimitar::Lists::QueryParser.new(attribute_map)

      parser.parse(params[:filter])
      # Then use 'parser' to read e.g. #tree or #rpn and turn this into a
      # query object for your storage engine. With ActiveRecord, you could
      # just do: parser.to_activerecord_query(base_scope)
    else
      # Return a query object for 'all results' (e.g. User.all).
    end

    # Assuming the 'query' object above had ActiveRecord-like semantics,
    # you'd create a Scimitar::Lists::Count object with total count filled in
    # via #scim_pagination_info and obtain a page of results with something
    # like the code shown below.
    pagination_info = scim_pagination_info(query.count())
    page_of_results = query.offset(pagination_info.offset).limit(pagination_info.limit).to_a

    super(pagination_info, page_of_results) do | record |
      # Return each instance as a SCIM object, e.g. via Scimitar::Resources::Mixin#to_scim
      record.to_scim(location: url_for(action: :show, id: record.id))
    end
  end

  def show
    super do |user_id|
      user = find_user(user_id)
      # Evaluate to the record as a SCIM object, e.g. via Scimitar::Resources::Mixin#to_scim
      user.to_scim(location: url_for(action: :show, id: user_id))
    end
  end

  def create
    super do |scim_resource|
      # Create an instance based on the Scimitar::Resources::User in
      # "scim_resource" (or whatever your ::storage_class() defines via its
      # ::scim_resource_type class method).
      record = self.storage_class().new
      record.from_scim!(scim_hash: scim_resource.as_json())
      self.save!(record)
      # Evaluate to the record as a SCIM object (or do that via "self.save!")
      record.to_scim(location: url_for(action: :show, id: record.id))
    end
  end

  def replace
    super do |record_id, scim_resource|
      # Fully update an instance based on the Scimitar::Resources::User in
      # "scim_resource" (or whatever your ::storage_class() defines via its
      # ::scim_resource_type class method). For example:
      record = self.find_user(record_id)
      record.from_scim!(scim_hash: scim_resource.as_json())
      self.save!(record)
      # Evaluate to the record as a SCIM object (or do that via "self.save!")
      record.to_scim(location: url_for(action: :show, id: record_id))
    end
  end

  def update
    super do |record_id, patch_hash|
      # Partially update an instance based on the PATCH payload *Hash* given
      # in "patch_hash" (note that unlike the "scim_resource" parameter given
      # to blocks in #create or #replace, this is *not* a high-level object).
      record = self.find_user(record_id)
      record.from_scim_patch!(patch_hash: patch_hash)
      self.save!(record)
      # Evaluate to the record as a SCIM object (or do that via "self.save!")
      record.to_scim(location: url_for(action: :show, id: record_id))
    end
  end

  def destroy
    super do |user_id|
      user = find_user(user_id)
      user.delete
    end
  end

  protected

    # The class including Scimitar::Resources::Mixin which declares mappings
    # to the entity you return in #resource_type.
    #
    def storage_class
      User
    end

    # Find your user. The +id+ parameter is one of YOUR identifiers, which
    # are returned in "id" fields in JSON responses via SCIM schema. If the
    # remote caller (client) doesn't want to remember your IDs and hold a
    # mapping to their IDs, then they do an index with filter on their own
    # "externalId" value and retrieve your "id" from that response.
    #
    def find_user(id)
      # Find records by your ID here.
    end

    # Persist 'user' - for example, if we *were* using ActiveRecord...
    #
    def save!(user)
      user.save!
    rescue ActiveRecord::RecordInvalid => exception
      raise Scimitar::ResourceInvalidError.new(user.errors.full_messages.join('; '))
    end

end
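
For the pagination step in #index above, it may help to see the arithmetic involved when a storage engine has no ActiveRecord-like #offset and #limit. SCIM's startIndex query parameter is 1-based and values below 1 are treated as 1, while a negative count is treated as 0 (RFC 7644). The sketch below is plain Ruby and is not Scimitar's #scim_pagination_info; PageWindow, sketch_page_window and the default_limit of 100 are assumptions made for illustration.

```ruby
PageWindow = Struct.new(:offset, :limit, :start_index, :total)

# Convert SCIM's 1-based startIndex/count parameters into a 0-based
# offset and a limit suitable for slicing any ordered collection.
def sketch_page_window(params, total, default_limit: 100)
  start_index = [params.fetch('startIndex', 1).to_i, 1].max
  limit       = [params.fetch('count', default_limit).to_i, 0].max

  PageWindow.new(start_index - 1, limit, start_index, total)
end

# Example of applying the window to an in-memory Array of records.
def sketch_page(records, params)
  window = sketch_page_window(params, records.length)
  [window, records.drop(window.offset).first(window.limit)]
end
```

With an Array-backed store, sketch_page(all_records, params) returns the window (for building the list response metadata) alongside the page of records.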

Note that the Scimitar::ApplicationController parent class of Scimitar::ResourcesController has a few methods to help with handling exceptions and rendering them as SCIM responses; for example, if a resource were not found by ID, you might wish to use Scimitar::ApplicationController#handle_resource_not_found.
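
The #index example above leaves "turn the parsed filter into a query for your storage engine" as an exercise. As a rough idea of what that can involve for an in-memory store, the sketch below handles only single-clause filters of the form attribute-operator-quoted-value with the eq, co and sw operators, compared case-insensitively. It deliberately does not use Scimitar::Lists::QueryParser; SketchPerson, SKETCH_QUERYABLE, SIMPLE_FILTER and sketch_filter are hypothetical names, and real SCIM filters (and/or, grouping, other operators) need a proper parser.

```ruby
SketchPerson = Struct.new(:first_name, :last_name)

# SCIM attribute name => local accessor, in the spirit of
# scim_queryable_attributes (simplified to plain accessors).
SKETCH_QUERYABLE = {
  'givenName'  => :first_name,
  'familyName' => :last_name
}.freeze

# Matches e.g.: givenName sw "Ba"
SIMPLE_FILTER = /\A\s*([\w.]+)\s+(eq|co|sw)\s+"([^"]*)"\s*\z/i

def sketch_filter(records, filter)
  match = SIMPLE_FILTER.match(filter)
  raise ArgumentError, "unsupported filter: #{filter}" if match.nil?

  accessor = SKETCH_QUERYABLE.fetch(match[1]) do
    raise ArgumentError, "not queryable: #{match[1]}"
  end
  operator = match[2].downcase
  wanted   = match[3].downcase

  records.select do |record|
    actual = record.public_send(accessor).to_s.downcase
    case operator
    when 'eq' then actual == wanted
    when 'co' then actual.include?(wanted)
    when 'sw' then actual.start_with?(wanted)
    end
  end
end
```

Rejecting unknown attributes and unsupported syntax with an exception, rather than silently returning everything, keeps a partial implementation from leaking unfiltered data.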

Extension schema

You can extend schema with custom data by defining an extension class and calling ::extend_schema on the SCIM resource class to which the extension applies. These extension classes:

The ::id class method defines a unique schema ID that is used to namespace payloads or paths in JSON responses describing extended resources, JSON payloads creating them or PATCH paths modifying them. The RFCs require this to be a URN (see RFC 2141). Your extension's ID URN must be globally unique. Depending on your expected use case, you should review the IANA registration considerations that RFC 7643 describes and definitely review the syntactic structure declaration therein (urn:ietf:params:scim:{type}:{name}{:other}).

For example, we might choose to use the RFC-defined User extension schema to define a couple of extra fields our User model happens to support:

class UserEnterpriseExtension < Scimitar::Schema::Base
  def initialize(options = {})
    super(
      name:            'ExtendedUser',
      description:     'Enterprise extension for a User',
      id:              self.class.id,
      scim_attributes: self.class.scim_attributes
    )
  end

  def self.id
    'urn:ietf:params:scim:schemas:extension:enterprise:2.0:User'
  end

  def self.scim_attributes
    [
      Scimitar::Schema::Attribute.new(name: 'organization', type: 'string'),
      Scimitar::Schema::Attribute.new(name: 'department',   type: 'string')
    ]
  end
end

...with the super call providing your choice of name and description, but also always providing id and scim_attributes as shown above. The class name chosen here is just an example and the class can be put inside any level of wrapping namespaces you choose - it's your class that can be named however you like. The extension class is then applied to the SCIM User resource globally in your application by calling:

Scimitar::Resources::User.extend_schema(UserEnterpriseExtension)

This is often done in config/initializers/scimitar.rb to help make it very clear that extensions are globally available and remove the risk of SCIM resources somehow being referenced before schema extensions have been applied.

In def self.scim_attributes_map in the underlying data model, add any new fields - organization and department in this example - to map them to whatever the equivalent data model attributes are, just as you would do with any other resource fields. These are declared without any special nesting - for example:

def self.scim_attributes_map
  return {
    id:           :id,
    externalId:   :scim_uid,
    userName:     :username,
    # ...etc...
    organization: :company,
    department:   :team
  }
end

Whatever you provide in the ::id method in your extension class will be used as a namespace in JSON data. This means that, for example, a SCIM representation of the above resource would look something like this:

{
  "schemas": [
    "urn:ietf:params:scim:schemas:core:2.0:User",
    "urn:ietf:params:scim:schemas:extension:enterprise:2.0:User"
  ],
  "id": "2819c223-7f76-453a-413861904646",
  "externalId": "701984",
  "userName": "bjensen@example.com",
  // ...

  "urn:ietf:params:scim:schemas:extension:enterprise:2.0:User": {
    "organization": "Corporation Incorporated",
    "department": "Marketing"
  },
  // ...
}

...and likewise, creation via POST would require the same nesting if a caller wanted to create a resource instance with those extended properties set (and RFC-compliant consumers of your SCIM API should already be doing this). For PATCH operations, the path uses a colon to separate the ID/URN part from the path rather than just using a dot as you might expect from the JSON nesting above:

{
  "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
  "Operations": [
    {
      "op": "replace",
      "path": "urn:ietf:params:scim:schemas:extension:enterprise:2.0:User:organization",
      "value": "Sales"
    }
  ]
}
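
To illustrate the colon-separated form, here is a small plain-Ruby sketch that splits a PATCH path into its schema URN and the attribute path that follows it, given a list of schema IDs the application knows about. Scimitar performs its own path handling internally; sketch_split_patch_path and KNOWN_SCHEMA_IDS are hypothetical names used only to show the structure of the path.

```ruby
KNOWN_SCHEMA_IDS = [
  'urn:ietf:params:scim:schemas:core:2.0:User',
  'urn:ietf:params:scim:schemas:extension:enterprise:2.0:User'
].freeze

# Returns [schema_id, attribute_path]. When the path does not start with a
# known schema ID followed by a colon, schema_id is nil and the path is
# returned unchanged.
def sketch_split_patch_path(path, known_schema_ids = KNOWN_SCHEMA_IDS)
  schema_id = known_schema_ids.find { |id| path.start_with?("#{id}:") }
  return [nil, path] if schema_id.nil?

  [schema_id, path[(schema_id.length + 1)..-1]]
end
```

A plain dot-based split would not work here, because the URN itself contains both colons and a dot ("2.0").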

IMPORTANT: Attribute names must be unique across your entire combined schema, regardless of URNs used. This is because of a limitation in Scimitar's implementation. This GitHub issue explains more. If this is a problem for you, please comment on the GitHub issue to help the maintainers understand the level of demand for remediation.

Resource extensions can provide any fields you choose, under any ID/URN you choose, to either RFC-described resources or entirely custom SCIM resources. There are no hard-coded assumptions or other "magic" that might require you to only extend RFC-described resources with RFC-described extensions. Of course, if you use custom resources or custom extensions that are not described by the SCIM RFCs, then the SCIM API you provide may only work with custom-written API callers that are aware of your bespoke resources and/or extensions.

Extensions can also contain complex attributes such as groups. For instance, if you want the ability to write to groups from the User resource perspective (since the 'groups' collection in a SCIM User resource is read-only), you can add one attribute to your extension like this:

Scimitar::Schema::Attribute.new(name: "userGroups", multiValued: true, complexType: Scimitar::ComplexTypes::ReferenceGroup, mutability: "writeOnly"),

Then map it in your scim_attributes_map:

  userGroups: [
    {
      list: :groups,
      find_with: ->(value) { Group.find(value["value"]) },
      using: {
        value:   :id,
        display: :name
      }
    }
  ]

And write to it like this:

{
  "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
  "Operations": [
    {
      "op": "replace",
      "path": "urn:ietf:params:scim:schemas:extension:enterprise:2.0:User:userGroups",
      "value": [{ "value": "1" }]
    }
  ]
}

Helping with auto-discovery

If you have an API consumer entity querying your Scimitar-based SCIM API provider endpoint and want to enable a degree of auto-discovery for that entity, then depending on your implementation, there may be customisations you wish to make.

Default resources

By default, Scimitar advertises (via things like the /Schemas endpoint) support for both a User and Group resource, but if you (say) only support a User concept, you override the default using code such as this in your config/initializers/scimitar.rb file:

Rails.application.config.to_prepare do
  Scimitar::Engine.set_default_resources([Scimitar::Resources::User])
  # ...other Scimitar configuration / initialisation code...
end

Security

One vital feature of SCIM is its authorisation and security model. The best resource I've found to describe this in any detail is section 2 of the protocol RFC, 7644.

Often, you'll find that bearer tokens are in use by SCIM API consumers, but the way in which they are used by that consumer in practice can vary a great deal. For example, suppose a corporation uses Microsoft Azure Active Directory to maintain a master database of employee details. Azure lets administrators connect to SCIM endpoints for services that this corporation might use. In all cases, bearer tokens are used.
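
On the wire, a bearer token normally arrives in an HTTP header of the form "Authorization: Bearer <token>". Scimitar deals with the header for you and hands the extracted token to your token_authenticator block, so the following plain-Ruby sketch is only meant to show what a client is expected to send; sketch_bearer_token is a hypothetical helper and not part of the gem.

```ruby
# Extract the token from an Authorization header value, or return nil when
# the header is absent, uses another scheme, or carries no token.
def sketch_bearer_token(authorization_header)
  scheme, token = authorization_header.to_s.strip.split(/\s+/, 2)
  return nil unless scheme && scheme.casecmp?('Bearer')
  return nil if token.nil? || token.empty?

  token
end
```

This can be handy in tests or when debugging a client integration, for example to confirm that a client is sending a bearer token at all rather than HTTP Basic credentials.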

Limitations

Specification versus implementation

If you believe choices made in this section may be incorrect, please create a GitHub issue describing the problem.

Omissions

If you would like to see something listed in this section implemented, please create a GitHub issue asking for it to be implemented, or if possible, implement the feature and send a Pull Request.

Development

Install Ruby dependencies first:

bundle install

Tests

For testing, two main options are available:

Testing on your machine

You will need to have PostgreSQL running. This database is chosen for tests to prove case-insensitive behaviour via detection of ILIKE in generated queries. Using SQLite would have resulted in a more conceptually self-contained test suite, but SQLite is case-insensitive by default and uses "LIKE" either way, making it hard to "see" if the query system is doing the right thing.

After bundle install and with PostgreSQL up, set up the test database with:

pushd spec/apps/dummy
RAILS_ENV=test bundle exec bin/rails db:drop db:create db:migrate
popd

...and thereafter, run tests with:

bundle exec rspec

You can get an idea of the resulting test coverage by opening coverage/index.html in your preferred web browser.

Testing with Docker (Compose)

In order to be able to utilise the Docker Compose setup, you will need to have Docker installed with the Compose plugin. For an easy installation of Docker (with a GUI and the Compose plugin preinstalled) please see Docker Desktop.

In order to configure the Docker image, run docker compose build in a terminal of your choice, in the root of this project. This will download the required image and install the required libraries. After this is complete, running the tests is as easy as running the command docker compose up test.

As mentioned in the previous section, test coverage may be analysed using coverage/index.html after running the tests.

You can also open a raw terminal in this test container by running docker compose run --rm test sh. For more Compose commands, please refer to the Docker Compose reference manual.

Internal documentation

Locally generated RDoc HTML seems to contain a more comprehensive and inter-linked set of pages than those available from rubydoc.info. You can (re)generate the internal rdoc documentation with:

bundle exec rake rerdoc

...yes, that's rerdoc - Re-R-Doc - then open docs/rdoc/index.html.