Information content security on the Internet: the control model and its evaluation (SCIS 2010)

wkrp commented 1 year ago

The next live reading group discussion will have a twist: this time we'll read a paper that has a pro-censorship point of view, coauthored by one of the creators of the Great Firewall.

Sunday, 2023-06-04 13:00–14:00

"Information content security on the Internet: the control model and its evaluation" 方滨兴 (Fang Binxing), 郭云川 (Guo Yunchuan), 周渊 (Zhou Yuan) PDF

The paper is basically a modeling paper. It studies network filtering (which the authors call "information content security") as an access control problem. There is a lot of math notation, but don't let that scare you: it is not as sophisticated as it is trying to appear. A lot of the formalism is gratuitous and not very significant, and there are minor notational inconsistencies throughout. The three subsections of Section 3.2 are basically the same information copied three times. The big ideas to look out for are the division into three sub-models: content-based ("what is the content of the communication"), identity-based ("who communicates with whom"), and behavior-based ("how do they communicate"); and the evaluation in terms of false positive and false negative rates on the axes of technology and society.

The first author Fang Binxing's name should be familiar to anyone who has studied Internet censorship in China. He helped lead the initial development of the Great Firewall and continues to be involved with its maintenance. But don't be intimidated by his distinguished record: Fang is a VPN user just like the rest of us.

wkrp commented 1 year ago

Reference [19] is potentially interesting. It is cited in Section 3.1, in the context of optimizing the placement of capture devices in a communications graph.

Bao YB, Huo LT, Shi JQ, et al. "A formalized model and application of resource discovery" (in Chinese). In: Proceedings of Conference on National Computer Network Emergency Response Technical Team of China, Shenzhen, 2008. 283–290.

Can anyone find a copy? The original title is likely Chinese, not English. The conference is possibly the same as the China Cyber Security Annual Conference (中国网络安全年会) organized by CNCERT. There are online proceedings but they only go back to 2018 (see also dblp). There is a Baike article about the 2015 conference.

Gowee commented 1 year ago

https://archive.org/search?query=creator%3A%22%E6%96%B9%E6%BB%A8%E5%85%B4%22

wkrp commented 1 year ago

Wow, thank you!

wkrp commented 1 year ago

Information content security on the Internet: the control model and its evaluation 方滨兴 (Fang Binxing), 郭云川 (Guo Yunchuan), 周渊 (Zhou Yuan) https://www.sciengine.com/SCIS/doi/10.1007/s11432-010-0014-z

The paper presents a model of network filtering, here called "information content security" (ICS). Its approach is to treat ICS as an access control problem. But instead of controlling access to its own information, the network monitor controls access to outside information. Because the outside publishers do not cooperate with the monitor and may actually want their information to be freely available, some changes from the traditional understanding of access control are required. The most notable change is the placement of the detection and enforcement device, which the authors call the reference monitor (RM). Since client-side and server-side placement of the RM is not available, the authors propose placing the RM in the communications path between the client and server, creating what they call a network-side reference monitor (NRM), which you can understand as a firewall or DPI middlebox. The paper presents a framework for information content security in the NRM model, and discusses ways of evaluating its effectiveness.

Their control model, called ICCON, is the fusion of three sub-models:

ICCON_C, content-based, "what is the content of the communication"
ICCON_I, identity-based, "who communicates with whom"
ICCON_B, behavior-based, "how do they communicate"

Content-based control (think HTTP keyword filtering) has the finest granularity, identity-based control (think IP address and SNI blocking) has coarser granularity, and behavior-based control (think analyzing packet flows, correlating multiple connections) is coarser still. Although all three modules are meant to be used together, the text seems to show a preference order: content-based if it is available, identity-based if content features are not available, and behavior-based as a last resort if no other features are available:

For example, if the content can be effectively obtained and authenticated, then the content-based control can be used. But if the content is encrypted, then we cannot effectively authenticate it, which makes the content-based control fail. In the case, if the identity (such as the IP address) of communicators can be obtained, then the identity-based control can be adopted. However, if disseminators employ the anonymity technologies to hide their identities, then we cannot obtain their real identity. In the situation, we should authenticate the behavior of communication, i.e. the behavior-based control should be employed.
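To make the preference order concrete, here is a minimal Python sketch of the fallback logic as I read it from the quoted passage. The `Observation` class and `choose_control` function are hypothetical names of my own, not anything defined in the paper, and the sketch ignores the fact that the three sub-models are meant to be used together.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """What a hypothetical network-side monitor extracted from one flow."""
    content: Optional[str] = None   # None if the payload is encrypted
    identity: Optional[str] = None  # None if hidden by anonymity technology


def choose_control(obs: Observation) -> str:
    """Pick a sub-model using the preference order suggested by the text."""
    if obs.content is not None:
        # Content can be obtained and authenticated: finest granularity.
        return "content-based"
    if obs.identity is not None:
        # Content is unavailable, but an identity (e.g. IP address) is visible.
        return "identity-based"
    # Neither content nor identity is usable: fall back to behavior.
    return "behavior-based"


plaintext_flow = Observation(content="GET /index.html", identity="192.0.2.1")
encrypted_flow = Observation(content=None, identity="192.0.2.1")
anonymized_flow = Observation(content=None, identity=None)
```

With these three example flows, `choose_control` returns the content-based, identity-based, and behavior-based labels respectively.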

Section 3 breaks the ICS control process into three stages: information acquisition, information authentication, and response. (Compare to Tschantz et al.'s division of censorship into "detection" and "action", and to the condition–strategy pairs of the patent application CN109391590.) Each stage has its own quality metrics, which will figure into the evaluation.

Information acquisition: Information acquisition consists of two parts: capture and extraction. The authors claim that extraction is trivial (it's just protocol parsing), so they focus on capture. Channel capture (Definition 1) means choosing a set of edges on a communication graph to maximize the number of paths observed between sets of distinguished source and destination nodes. The quality metric of this stage is the channel capture rate CapCh, which balances the size of the capture set against the number of paths observed.
Information authentication: Information authentication means checking the information acquired in the previous stage against access control policy rules. Content-based and identity-based authentication are treated about the same, matching captured content against a set of templates and outputting a response score. Behavior-based authentication is additionally described as requiring not just one, but a sequence of actions, which they formalize as a finite state automaton. The quality metrics of this stage are the false classification rates FPR and FNR.
Response: The authors do not get into details in the response stage, except to say that response can be active (blocking) or passive (logging). The quality metric of this stage is the response efficiency BlockCh.
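As a rough illustration of the channel capture problem in the information acquisition stage, the sketch below picks edges to monitor with a simple greedy heuristic. This is my own assumption about how one might approach it, not the authors' algorithm, and the "fraction of paths observed" it returns is a stand-in for illustration, not the paper's CapCh formula. The function name, the `budget` parameter, and the example topology are all hypothetical.

```python
def greedy_capture(paths, budget):
    """Greedily choose up to `budget` edges to monitor.

    paths: list of source-to-destination paths, each a list of node names.
    Returns (chosen_edges, fraction_of_paths_observed).
    """
    # Represent each path by the set of directed edges it traverses.
    path_edges = [set(zip(p, p[1:])) for p in paths]
    uncovered = set(range(len(paths)))
    chosen = []
    for _ in range(budget):
        # Count how many still-unobserved paths each edge would reveal.
        gain = {}
        for i in uncovered:
            for edge in path_edges[i]:
                gain[edge] = gain.get(edge, 0) + 1
        if not gain:
            break  # every path is already observed
        # Sort first so that ties are broken deterministically.
        best = max(sorted(gain), key=gain.get)
        chosen.append(best)
        uncovered = {i for i in uncovered if best not in path_edges[i]}
    observed = 1 - len(uncovered) / len(paths) if paths else 0.0
    return chosen, observed


# Hypothetical topology: clients a-d, routers r1-r3, servers x, y, z.
example_paths = [
    ["a", "r1", "r2", "x"],
    ["b", "r1", "r2", "y"],
    ["c", "r3", "r2", "x"],
    ["d", "r3", "z"],
]
edges, observed = greedy_capture(example_paths, budget=2)
```

The trade-off the authors describe shows up directly: a larger `budget` observes more paths, but at the cost of a bigger capture set.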

The evaluation combines the quality metrics from the three stages of the control process. The authors emphasize that the FPR and FNR of information authentication must be evaluated on two axes, the technological and the societal. The technological evaluation tests an RM's classification performance against its own intended policy rules. The societal evaluation additionally measures how well those policy rules contribute to social goals. Section 2.2 and Example 18 give the example of keyword filtering to block pornography. The technological goal is to block all pages that contain a "sex" keyword; the societal goal is to block access to pornography. Because there are many pages that contain the keyword that are not pornographic, a filter with 100% effectiveness in blocking the keyword will overblock from a societal perspective: the technological evaluation will be high while the societal evaluation will be low.
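The two-axis evaluation can be worked through with a toy example. The sketch below scores the same keyword filter twice: once against its own rule (technological axis) and once against whether a page is actually pornographic (societal axis). The page list and its labels are invented for illustration, and the last page is my own addition to show a false negative as well, which the paper's example does not discuss.

```python
def rates(blocked, should_block):
    """Return (FPR, FNR) for two parallel lists of booleans."""
    fp = sum(b and not s for b, s in zip(blocked, should_block))
    fn = sum(s and not b for b, s in zip(blocked, should_block))
    negatives = sum(not s for s in should_block)
    positives = sum(should_block)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


def keyword_filter(text, keyword="sex"):
    """A filter that blocks any page containing the keyword."""
    return keyword in text.lower()


# Hypothetical pages: (text, is_actually_pornographic)
pages = [
    ("explicit sex video", True),
    ("sex education for teenagers", False),
    ("Middlesex County library hours", False),
    ("weather forecast", False),
    ("adult content with no keyword", True),
]

blocked = [keyword_filter(text) for text, _ in pages]

# Technological axis: the ground truth is the policy rule itself.
tech_truth = ["sex" in text.lower() for text, _ in pages]
# Societal axis: the ground truth is the social goal behind the rule.
soc_truth = [is_porn for _, is_porn in pages]

tech_fpr, tech_fnr = rates(blocked, tech_truth)
soc_fpr, soc_fnr = rates(blocked, soc_truth)
```

On this toy data the filter is perfect on the technological axis (both rates are 0), while on the societal axis it overblocks two of the three non-pornographic pages and misses one of the two pornographic ones.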

wkrp commented 1 year ago

The reading group for "Information content security on the Internet" will start 19 hours from now at 2023-06-04 13:00.

https://meet.jit.si/moderated/c70722d46b61661b535d359ebfc1a14dba396dc82bdccfc28f260218d1425418

As usual, I'll try to get the call started about 20 minutes early, to give time to sort out connection problems, and I will post a recording after the fact.

wkrp commented 1 year ago

Link to video

Here is the video of the discussion.

Links to references:
