anitsh / til

Today I Learn (til) - Github `Issues` used as daily learning management system for taking notes and storing resource links.
https://anitshrestha.com.np
MIT License

Data Model, Data Modelling #829

Open anitsh opened 2 years ago

anitsh commented 2 years ago


Data modelling is the process of creating a data model for an information system by applying certain formal techniques.

A data model is an abstract model (or conceptual model) that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. For instance, a data model may specify that the data element representing a car be composed of a number of other elements which, in turn, represent the color and size of the car and define its owner.
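
The car example above can be sketched in code. This is a minimal illustration, not a prescribed notation: the class names `Car` and `Owner`, the `name` field, and the use of plain strings for color and size are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Owner:
    """Hypothetical owner element; the single 'name' field is an assumption."""
    name: str


@dataclass
class Car:
    """A car data element composed of other elements, as in the text."""
    color: str
    size: str     # how size is encoded (here a label) is an assumption
    owner: Owner  # relates the car to the element that defines its owner


car = Car(color="red", size="compact", owner=Owner(name="Alice"))
print(car.owner.name)
```

The point of the sketch is only that the model fixes which elements a car is composed of and how they relate, independently of any particular car.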

The term data model can refer to two distinct but closely related concepts. Sometimes it refers to an abstract formalization of the objects and relationships found in a particular application domain: for example the customers, products, and orders found in a manufacturing organization. At other times it refers to the set of concepts used in defining such formalizations: for example concepts such as entities, attributes, relations, or tables. So the "data model" of a banking application may be defined using the entity-relationship "data model". We use the term in both senses.

A data model explicitly determines the structure of data. Data models are typically specified by a data specialist, data librarian, or a digital humanities scholar in a data modeling notation. These notations are often represented in graphical form.[7]

A data model can sometimes be referred to as a data structure, especially in the context of programming languages. Data models are often complemented by function models, especially in the context of enterprise models.

Managing large quantities of structured and unstructured data is a primary function of information systems. Data models describe the structure, manipulation, and integrity aspects of the data stored in data management systems such as relational databases. They may also describe data with a looser structure, such as word processing documents, email messages, pictures, digital audio, and video: XDM, for example, provides a data model for XML documents.

According to the American National Standards Institute (ANSI), a data model instance may be one of three kinds: conceptual, logical, or physical.

The significance of this approach is that it allows the three perspectives to be relatively independent of each other. Storage technology can change without affecting either the logical or the conceptual model. The table/column structure can change without (necessarily) affecting the conceptual model. In each case, of course, the structures must remain consistent with the other model. The table/column structure may be different from a direct translation of the entity classes and attributes, but it must ultimately carry out the objectives of the conceptual entity class structure.

Early phases of many software development projects emphasize the design of a conceptual data model. Such a design can be detailed into a logical data model. In later stages, this model may be translated into a physical data model. However, it is also possible to implement a conceptual model directly.

Conceptual Modeling

Conceptual modeling is the activity of formally describing some aspects of the physical and social world around us for the purposes of understanding and communication.

The value of a model is usually directly proportional to how well it corresponds to a past, present, future, actual or potential state of affairs. A model of a concept is quite different because in order to be a good model it need not have this real world correspondence.

A conceptual model's primary objective is to convey the fundamental principles and basic functionality of the system which it represents. Also, a conceptual model must be developed in such a way as to provide an easily understood system interpretation for the model's users. A conceptual model, when implemented properly, should satisfy four fundamental objectives.

Enhance an individual's understanding of the representative system
Facilitate efficient conveyance of system details between stakeholders
Provide a point of reference for system designers to extract system specifications
Document the system for future reference and provide a means for collaboration

The conceptual model plays an important role in the overall system development life cycle.

Figure: The role of the conceptual model in a typical system development scheme.

If the conceptual model is not fully developed, fundamental system properties may not be implemented properly, giving way to future problems or system shortfalls. These failures do occur in the industry and have been linked to lack of user input, incomplete or unclear requirements, and changing requirements. Those weak links in the system design and development process can be traced to improper execution of the fundamental objectives of conceptual modeling. The importance of conceptual modeling is evident when such systemic failures are mitigated by thorough system development and adherence to proven development objectives and techniques.

As systems have become increasingly complex, the role of conceptual modelling has dramatically expanded. With that expanded presence, the effectiveness of conceptual modeling at capturing the fundamentals of a system is being realized. Building on that realization, numerous conceptual modeling techniques have been created. These techniques can be applied across multiple disciplines to increase the user's understanding of the system to be modeled.

Some techniques:

Data Flow Modeling graphically represents elements of a system to bring the major system functions into context. It does not convey complex system details such as parallel development considerations or timing information.

Entity Relationship Modeling is used for software system representation. Entity-relationship diagrams, which are a product of executing this technique, are normally used to represent database models and information systems. The main components of the diagram are the entities and relationships. The entities can represent independent functions, objects, or events. The relationships are responsible for relating the entities to one another. To form a system process, the relationships are combined with the entities and any attributes needed to further describe the process. Multiple diagramming conventions exist for this technique; they are just different ways of viewing and organizing the data to represent different system aspects.
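
As a rough sketch of the components named above (entities, attributes, and relationships), the following Python models a tiny entity-relationship structure. The `Entity` and `Relationship` classes, the `Customer`/`Order` example, and the `"1:N"` cardinality label are hypothetical choices for illustration and do not follow any specific diagramming convention.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Entity:
    """An entity with the attributes that describe it."""
    name: str
    attributes: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Relationship:
    """Relates one entity to another; cardinality is a free-form label here."""
    name: str
    source: Entity
    target: Entity
    cardinality: str

    def describe(self) -> str:
        # Render the relationship as a one-line textual "diagram".
        return f"{self.source.name} --{self.name} ({self.cardinality})--> {self.target.name}"


customer = Entity("Customer", ("customer_id", "name"))
order = Entity("Order", ("order_id", "placed_on"))
places = Relationship("places", customer, order, "1:N")
print(places.describe())
```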

The Event-driven Process Chain (EPC) is used to systematically improve business process flows. It consists of entities/elements and functions that allow relationships to be developed and processed. More specifically, it is made up of events, which define what state a process is in or the rules by which it operates. In order to progress through events, a function (an active event) must be executed. Depending on the process flow, the function has the ability to transform event states or link to other event-driven process chains. Other elements exist within an EPC, all of which work together to define how and by what rules the system operates. The EPC technique can be applied to business practices such as resource planning, process improvement, and logistics.
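
The alternation of events and functions can be sketched as follows. This is a deliberately simplified illustration: it treats events as plain state names and functions as steps that move the process from one event to the next, and it leaves out the other EPC elements mentioned above. The `Function` class, `run_chain`, and the order-handling example are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Function:
    """An active step that moves the process from one event to the next."""
    name: str
    from_event: str
    to_event: str


def run_chain(start_event: str, functions: List[Function]) -> List[str]:
    """Follow functions from start_event; return the event/function trace.

    Assumes at most one function leaves each event and the chain has no cycles.
    """
    by_source: Dict[str, Function] = {f.from_event: f for f in functions}
    trace = [start_event]
    current = start_event
    while current in by_source:
        step = by_source[current]
        trace.extend([step.name, step.to_event])
        current = step.to_event
    return trace


functions = [
    Function("check stock", "order received", "stock confirmed"),
    Function("ship goods", "stock confirmed", "order shipped"),
]
trace = run_chain("order received", functions)
print(" -> ".join(trace))
```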

Because the conceptual modeling method can sometimes be purposefully vague to account for a broad area of use, the actual application of conceptual modeling can become difficult. To alleviate this issue, and to shed some light on what to consider when selecting an appropriate conceptual modeling technique, there is a framework proposed by Gemino and Wand.

Before evaluating the effectiveness of a conceptual modeling technique for a particular application, an important concept must be understood: comparing conceptual models by focusing specifically on their graphical or top-level representations is shortsighted. The emphasis should be placed on the conceptual modeling language when choosing an appropriate technique.

In general, a conceptual model is developed using some form of conceptual modeling technique. That technique will utilize a conceptual modeling language that determines the rules for how the model is arrived at. Understanding the capabilities of the specific language used is inherent to properly evaluating a conceptual modeling technique, as the language reflects the technique's descriptive ability. Also, the conceptual modeling language will directly influence the depth at which the system is capable of being represented, whether it be complex or simple.

A modeling language is any artificial language (i.e. computer communication terminologies) that can be used to express information or knowledge or systems in a structure that is defined by a consistent set of rules. The rules are used for interpretation of the meaning of components in the structure.

The main affecting factors to consider are the content that the conceptual model must represent, the method in which the model will be presented, the characteristics of the model's users, and the conceptual modeling language's specific task.

The conceptual model's content should be considered in order to select a technique that would allow relevant information to be presented. The presentation method for selection purposes would focus on the technique's ability to represent the model at the intended level of depth and detail.

The characteristics of the model's users or participants are an important aspect to consider. A participant's background and experience should coincide with the conceptual model's complexity; otherwise, misrepresentation of the system or misunderstanding of key system concepts could lead to problems in that system's realization.

The conceptual model language task will further allow an appropriate technique to be chosen. The difference between creating a system conceptual model to convey system functionality and creating a system conceptual model to interpret that functionality could involve two completely different types of conceptual modeling languages.

Gemino and Wand expand the affected variable content of their proposed framework by considering the focus of observation and the criterion for comparison. The focus of observation considers whether the conceptual modeling technique will create a "new product", or whether the technique will only bring about a more intimate understanding of the system being modeled. The criterion for comparison weighs the ability of the conceptual modeling technique to be efficient or effective. A conceptual modeling technique that allows for development of a system model which takes all system variables into account at a high level may make the process of understanding the system functionality more efficient, but the technique lacks the necessary information to explain the internal processes, rendering the model less effective.

When deciding which conceptual modeling technique to use, answer the following questions to evaluate the scope of the conceptual model and to address some important conceptual modeling considerations.

What content will the conceptual model represent?
How will the conceptual model be presented?
Who will be using or participating in the conceptual model?
How will the conceptual model describe the system?
What is the conceptual model's focus of observation?
Will the conceptual model be efficient or effective in describing the system?

Understanding the conceptual model's scope will lead to a more informed selection of a technique that properly addresses that particular model. It provides a rational and factual basis for assessment of simulation application appropriateness.

General model theory

A model is a simplifying image of reality. The image can be either a sensory, above all optically observable artifact or given purely theoretically.

According to Herbert Stachowiak, a model is characterized by at least three properties:

  1. Mapping - A model always is a model of something. It is an image or representation of some natural or artificial, existing or imagined original, where this original itself could be a model.

  2. Reduction - In general, a model will not include all attributes that describe the original but only those that appear as relevant to the model's creator or user.

  3. Pragmatism - A model does not relate unambiguously to its original. It is intended to work as a replacement for the original a) for certain subjects (for whom?) b) within a certain time range (when?) c) restricted to certain conceptual or physical actions (what for?).

For example, a street map is a model of the actual streets in a city (mapping), showing the course of the streets while leaving out, say, traffic signs and road markings (reduction), made for pedestrians and vehicle drivers for the purpose of finding one's way in the city (pragmatism).
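
The street map example can be expressed as a small sketch, assuming a hypothetical list of street records as the "original". The record fields, the `RELEVANT` tuple, and the `make_street_map` helper are all invented for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical "original": street records with more attributes than a map needs.
streets: List[Dict[str, object]] = [
    {"name": "Main St", "course": [(0, 0), (0, 5)],
     "traffic_signs": 12, "road_markings": "double line"},
    {"name": "Oak Ave", "course": [(0, 5), (3, 5)],
     "traffic_signs": 4, "road_markings": "dashed"},
]

# Pragmatism: which attributes count as relevant depends on the purpose,
# here finding one's way in the city.
RELEVANT: Tuple[str, ...] = ("name", "course")


def make_street_map(original, relevant=RELEVANT):
    """Mapping and reduction: one map entry per street, relevant attributes only."""
    return [{key: street[key] for key in relevant} for street in original]


street_map = make_street_map(streets)
print(street_map[0])
```

Every street in the original has a counterpart in the map (mapping), traffic signs and road markings are dropped (reduction), and a different purpose would call for a different choice of `RELEVANT` (pragmatism).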

Additional properties have been proposed, like extension and distortion as well as validity. The American philosopher Michael Weisberg differentiates between concrete and mathematical models and proposes computer simulations (computational models) as their own class of models.

System Model

A system model is the conceptual model that describes and represents the structure, behavior, and more views of a system. A system model can represent multiple views of a system by using two different approaches, architectural and non-architectural.

The architectural approach, also known as system architecture, instead of picking many heterogeneous and unrelated models, will use only one integrated architectural model.

The non-architectural approach respectively picks a model for each view.

Business Process Modelling

In business process modelling the enterprise process model is often referred to as the business process model. Process models are core concepts in the discipline of process engineering.

A process model describes a process at the type level. The same process model is used repeatedly for the development of many applications and thus has many instantiations.

One possible use of a process model is to prescribe how things must/should/could be done in contrast to the process itself which is really what happens. A process model is roughly an anticipation of what the process will look like. What the process shall be will be determined during actual system development.

Conceptual Framework

A conceptual framework is an analytical tool with several variations and contexts. It can be applied in different categories of work where an overall picture is needed. It is used to make conceptual distinctions and organize ideas. Strong conceptual frameworks capture something real and do this in a way that is easy to remember and apply.

Resource

anitsh commented 2 years ago

A well-thought-out and complete data model is the key to the development of a truly functional, useful, secure, and accurate database. Start with the conceptual model to lay out all the components and functions of the data model. Then refine those plans into a logical data model that describes the data flows and clarifies the definition of what data is needed and how it will be acquired, handled, stored, and distributed. The logical data model drives the physical data model that is specific to a database product and is the detailed design document that guides the creation of the database and application software.

Good data modeling and database design are essential to the development of functional, reliable, and secure application systems and databases that work well with data warehouses and analytical tools – and facilitate data exchange with business partners and among multiple application sets. Well-thought-out data models help ensure data integrity, making your company’s data even more valuable and reliable.

Data modeling is the process of creating a visual representation of either a whole information system or parts of it to communicate connections between data points and structures. The goal is to illustrate the types of data used and stored within the system, the relationships among these data types, the ways the data can be grouped and organized and its formats and attributes.


Data models are built around business needs. Rules and requirements are defined upfront through feedback from business stakeholders so they can be incorporated into the design of a new system or adapted in the iteration of an existing one.

Data can be modeled at various levels of abstraction. The process begins by collecting information about business requirements from stakeholders and end users. These business rules are then translated into data structures to formulate a concrete database design. A data model can be compared to a roadmap, an architect’s blueprint or any formal diagram that facilitates a deeper understanding of what is being designed.

Data modeling employs standardized schemas and formal techniques. This provides a common, consistent, and predictable way of defining and managing data resources across an organization, or even beyond.

Ideally, data models are living documents that evolve along with changing business needs. They play an important role in supporting business processes and planning IT architecture and strategy. Data models can be shared with vendors, partners, and/or industry peers.

Like any design process, database and information system design begins at a high level of abstraction and becomes increasingly more concrete and specific. Data models can generally be divided into three categories, which vary according to their degree of abstraction. The process will start with a conceptual model, progress to a logical model and conclude with a physical model. Each type of data model is discussed in more detail below:

Conceptual data models. They are also referred to as domain models and offer a big-picture view of what the system will contain, how it will be organized, and which business rules are involved. Conceptual models are usually created as part of the process of gathering initial project requirements. Typically, they include entity classes (defining the types of things that are important for the business to represent in the data model), their characteristics and constraints, the relationships between them and relevant security and data integrity requirements. Any notation is typically simple.

Logical data models. They are less abstract and provide greater detail about the concepts and relationships in the domain under consideration. One of several formal data modeling notation systems is followed. These indicate data attributes, such as data types and their corresponding lengths, and show the relationships among entities. Logical data models don’t specify any technical system requirements. This stage is frequently omitted in agile or DevOps practices. Logical data models can be useful in highly procedural implementation environments, or for projects that are data-oriented by nature, such as data warehouse design or reporting system development.

Physical data models. They provide a schema for how the data will be physically stored within a database. As such, they’re the least abstract of all. They offer a finalized design that can be implemented as a relational database, including associative tables that illustrate the relationships among entities as well as the primary keys and foreign keys that will be used to maintain those relationships. Physical data models can include database management system (DBMS)-specific properties, including performance tuning.
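
To make the physical level concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `student`, `course`, and `enrollment` tables are hypothetical; the sketch only illustrates primary keys, foreign keys, and an associative table, and the `PRAGMA` line is an example of a DBMS-specific property (SQLite enforces foreign keys only when it is switched on).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement
conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE course (
    course_id INTEGER PRIMARY KEY,
    title     TEXT NOT NULL
);
-- Associative table resolving the many-to-many relationship
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
""")

conn.execute("INSERT INTO student VALUES (1, 'Ada')")
conn.execute("INSERT INTO course VALUES (10, 'Databases')")
conn.execute("INSERT INTO enrollment VALUES (1, 10)")

# The foreign key rejects an enrollment for a student that does not exist.
try:
    conn.execute("INSERT INTO enrollment VALUES (2, 10)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

print(fk_enforced)
```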

Data modeling has evolved alongside database management systems, with model types increasing in complexity as businesses' data storage needs have grown. Here are several model types:

Hierarchical data models represent one-to-many relationships in a treelike format. In this type of model, each record has a single root or parent which maps to one or more child tables. This model was implemented in the IBM Information Management System (IMS), which was introduced in 1966 and rapidly found widespread use, especially in banking. Though this approach is less efficient than more recently developed database models, it’s still used in Extensible Markup Language (XML) systems and geographic information systems (GISs).
Relational data models were initially proposed by IBM researcher E.F. Codd in 1970. They are still implemented today in the many different relational databases commonly used in enterprise computing. Relational data modeling doesn’t require a detailed understanding of the physical properties of the data storage being used. In it, data segments are explicitly joined through the use of tables, reducing database complexity.

Relational databases frequently employ structured query language (SQL) for data management. These databases work well for maintaining data integrity and minimizing redundancy. They’re often used in point-of-sale systems, as well as for other types of transaction processing.
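
A small sketch of the relational idea, again with Python's built-in `sqlite3` module: each customer's name is stored once and sales refer to it by key, so a join brings the data back together. The `customer` and `sale` tables and their rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE sale (
    sale_id     INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    amount      REAL NOT NULL
);
INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO sale VALUES (100, 1, 25.0), (101, 1, 10.0), (102, 2, 40.0);
""")

# Join the two tables on the shared key and total the sales per customer.
rows = conn.execute("""
    SELECT c.name, SUM(s.amount)
    FROM sale AS s
    JOIN customer AS c ON c.customer_id = s.customer_id
    GROUP BY c.customer_id, c.name
    ORDER BY c.name
""").fetchall()
print(rows)
```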

Entity-relationship (ER) data models use formal diagrams to represent the relationships between entities in a database. Several ER modeling tools are used by data architects to create visual maps that convey database design objectives.
Object-oriented data models gained traction as object-oriented programming became popular in the mid-1990s. The “objects” involved are abstractions of real-world entities. Objects are grouped in class hierarchies, and have associated features. Object-oriented databases can incorporate tables, but can also support more complex data relationships. This approach is employed in multimedia and hypertext databases as well as other use cases.
Dimensional data models were developed by Ralph Kimball, and they were designed to optimize data retrieval speeds for analytic purposes in a [data warehouse](https://www.ibm.com/cloud/learn/data-warehouse). While relational and ER models emphasize efficient storage, dimensional models increase redundancy in order to make it easier to locate information for reporting and retrieval. This modeling is typically used across [OLAP](https://www.ibm.com/cloud/learn/olap) systems.

Two popular dimensional data models are the star schema and the snowflake schema. In the star schema, data is organized into facts (measurable items) and dimensions (reference information), where each fact is surrounded by its associated dimensions in a star-like pattern. The snowflake schema resembles the star schema but includes additional layers of associated dimensions, making the branching pattern more complex.
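
A star schema can be sketched with Python's built-in `sqlite3` module as one fact table surrounded by dimension tables. The table names (`fact_sales`, `dim_date`, `dim_product`), columns, and rows are hypothetical and chosen only to show the shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions: reference information
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER
);
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY, name TEXT, category TEXT
);
-- Fact: measurable items, keyed by the surrounding dimensions
CREATE TABLE fact_sales (
    date_id    INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    units      INTEGER,
    revenue    REAL
);
INSERT INTO dim_date VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO dim_product VALUES (1, 'Pen', 'Stationery'), (2, 'Lamp', 'Lighting');
INSERT INTO fact_sales VALUES (1, 1, 10, 15.0), (1, 2, 1, 30.0), (2, 1, 20, 30.0);
""")

# A typical reporting query: revenue by product category and month.
report = conn.execute("""
    SELECT p.category, d.month, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_id = f.date_id
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category, d.month
    ORDER BY p.category, d.month
""").fetchall()
print(report)
```

Note that `category` is stored redundantly on every product row, which keeps the reporting query to a single join per dimension. A snowflake variant would move `category` into its own table referenced from `dim_product`, adding one more layer of joins.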