IT-bestpractices / root

This is our root repository.

Readability of JSON #3

Closed h4ck3rm1k3 closed 8 years ago

h4ck3rm1k3 commented 9 years ago

Hi there. It seems to me that HTML formatting embedded in JSON is not optimal: the documents are not as readable as they should be. I have converted the docs to YAML, which is basically a superset of JSON that allows for better formatting. See my yaml branch: https://github.com/h4ck3rm1k3/root/blob/yaml/rules/os.app/security.domain/posix.class/linux.os/redhat/nist_V-38437.yaml

Of course you can transform them back and forth (see github.com:drbild/json2yaml), but I think that we are going to need some way to render these docs. I am of the opinion that for this scheme we could convert them to Markdown and back, based on the formatting given. I will think about a program to manage that.
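A minimal sketch of what such a Markdown round trip could look like. The function names and the "one `##` heading per field" layout are assumptions made for illustration, not an existing tool in this repository.

```python
def fields_to_markdown(rule_id, fields):
    """Render a rule's text fields as Markdown: one '#' title, one '##' section per field."""
    lines = ["# " + rule_id, ""]
    for name in sorted(fields):
        lines += ["## " + name, "", fields[name], ""]
    return "\n".join(lines)


def markdown_to_fields(markdown):
    """Parse the layout produced by fields_to_markdown back into (rule_id, fields)."""
    rule_id = None
    fields = {}
    current = None  # name of the section being collected
    buf = []
    for line in markdown.split("\n"):
        if line.startswith("## "):
            if current is not None:
                fields[current] = "\n".join(buf).strip("\n")
            current = line[3:].strip()
            buf = []
        elif line.startswith("# ") and current is None and rule_id is None:
            rule_id = line[2:].strip()
        elif current is not None:
            # Body lines, including root-prompt lines such as "# cat ...", are kept verbatim.
            buf.append(line)
    if current is not None:
        fields[current] = "\n".join(buf).strip("\n")
    return rule_id, fields
```

The sketch only round-trips cleanly when no field body contains a line starting with `## ` and no field begins or ends with a newline; a real converter would need an escaping rule for those cases.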

h4ck3rm1k3 commented 9 years ago

Here is the output of json2html, then converted from HTML to Markdown. Obviously not a solution for editing, but something to think about. https://github.com/bloopletech/json2html/tree/master

Object

severity: medium
tags: Object
    itbp: Array
        0: itbp-00001
text: Object
    en: Object
        check: To check that the correct queue discipline is enabled to avoid buffer bloat, execute the following command: # cat /proc/sys/net/core/default_qdisc The preferred result is 'fq_codel'. CoDel is also an acceptable result. All other values result in buffer bloat
        fix: To permanently fix this issue, add the following line to /etc/sysctl.conf. This will take effect when the machine reboots. net.core.default_qdisc fq_codel An immediate temporary fix can be accomplished by executing this command: echo fq_codel > /proc/sys/net/core/default_qdisc
        long_description: The default network queuing discipline should avoid buffer bloat - which destroys latency. net.core.default_qdisc sets the default queuing mechanism for Linux networking. It has very significant effects on network performance and latency. fq_codel is the current best queuing discipline for performance and latency on Linux machines. It is the current best practice for controlling buffer bloat. As of October 2014, the second best discipline is CoDel. For details from an entertaining 2014 presentation clearly explaining buffer bloat at the Linux Plumbers Conference by Stephen Hemminger[1], see http://lwn.net/Articles/616241/
        short_description: The default network queuing discipline should avoid buffer bloat.

h4ck3rm1k3 commented 9 years ago

Here is an example of an equivalent markdown that is easy to edit and parse.

itbp-00001

short_description

The default network queuing discipline should avoid buffer bloat.

check

To check that the correct queue discipline is enabled to avoid buffer bloat, execute the following command:

cat /proc/sys/net/core/default_qdisc

The preferred result is 'fq_codel'. CoDel is also an acceptable result. All other values result in buffer bloat.

fix

To permanently fix this issue, add the following line to /etc/sysctl.conf. This will take effect when the machine reboots.

net.core.default_qdisc fq_codel

FIXME: what does this line mean?

An immediate temporary fix can be accomplished by executing this command:

echo fq_codel > /proc/sys/net/core/default_qdisc

long_description

The default network queuing discipline should avoid buffer bloat - which destroys latency. net.core.default_qdisc sets the default queuing mechanism for Linux networking. It has very significant effects on network performance and latency. fq_codel is the current best queuing discipline for performance and latency on Linux machines. It is the current best practice for controlling buffer bloat. As of October 2014, the second best discipline is CoDel. For details from an entertaining 2014 presentation clearly explaining buffer bloat at the Linux Plumbers Conference by Stephen Hemminger [1], see http://lwn.net/Articles/616241/
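The check described in this rule could be automated roughly as follows. This is only a sketch: the function names and result labels are made up for illustration, and it assumes the kernel reports CoDel as `codel` in the /proc file.

```python
from pathlib import Path

# Path taken from the rule's "check" text.
QDISC_PATH = Path("/proc/sys/net/core/default_qdisc")


def classify_qdisc(value):
    """Map a default_qdisc value onto the three outcomes the rule describes."""
    name = value.strip().lower()
    if name == "fq_codel":
        return "preferred"
    if name == "codel":  # assumption: the kernel spells CoDel as "codel"
        return "acceptable"
    return "buffer bloat"


def check_default_qdisc(path=QDISC_PATH):
    """Return the classification, or None if the file cannot be read (e.g. not Linux)."""
    try:
        return classify_qdisc(path.read_text())
    except OSError:
        return None
```

Keeping the classification separate from the file read means the rule logic can be exercised on any machine, not only on a Linux host.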

Alan-R commented 9 years ago

So, the point of these documents is to deliver them by a web server over the internet to another program which in turn gives them to a web browser in the context of its operation. There are two file formats that are used when two machines are communicating with each other over the internet: JSON and XML. I really don't care for XML, and opted for JSON instead.

So, the original readability is not the point. The point is to deliver the results to another program in JSON. The transformation from YAML to JSON is lossy. Some YAML information cannot be transformed to JSON, so the loss of expected information would be considered surprising.
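One way to see the kind of loss meant here, using only the Python standard library. The dict below is a hypothetical stand-in for what a YAML loader might return (an integer key and a native date); no YAML is actually parsed in this sketch.

```python
import datetime
import json

# Hypothetical result of loading a YAML document such as:
#   1: first rule
#   updated: 2015-07-10
loaded = {1: "first rule", "updated": datetime.date(2015, 7, 10)}

# JSON only has string keys and no date type, so both are turned into strings.
# default=str is used here to stringify the date instead of raising TypeError.
as_json = json.dumps(loaded, default=str)
round_tripped = json.loads(as_json)

print(as_json)
print(round_tripped == loaded)  # False: the key and date types did not survive
```

Restricting the YAML to string keys and to strings, numbers, booleans, lists and mappings as values avoids this particular class of loss.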

By the way, the example you picked is the only one with HTML formatting in it. On the other hand, hyperlinks are useful, in order to provide even deeper information. In the end, it is expected that this will be rendered to HTML5, so embedding an HTML subset makes sense. Bold, italics, teletype, hyperlinks and perhaps even images (although I'm not so sure about those) make sense.

If you want to transform everything to YAML, you should just come up with a different version of my Python program and run it again to produce YAML. But you still have the issue of which subset of YAML transforms without loss to JSON, because in the end we need to render them into JSON for the programs that want to use these files.

Also keep in mind that we will need to update these files as the corresponding base sets of checks improve, so how to do that smoothly needs to be a reasonably serious consideration.


Alan Robertson / CTO AlanR@AssimilationSystems.com mailto:AlanR@AssimilationSystems.com/ +1 303.947.7999

Assimilation Systems Limited http://AssimilationSystems.com

Twitter https://twitter.com/ossalanr Linkedin https://www.linkedin.com/in/alanr skype https://htmlsig.com/skype?username=alanr_unix.sh

h4ck3rm1k3 commented 9 years ago

OK, I think I understand better now. I guess I will have to look at that program first before making any more comments. Do you have any links to this program? My preferred method of editing is emacs.


James Michael DuPont Kansas Linux Fest http://kansaslinuxfest.us Free/Libre Open Source and Open Knowledge Association of Kansas http://openkansas.us Member of Free Libre Open Source Software Kosova http://www.flossk.org Saving Wikipedia(tm) articles from deletion http://SpeedyDeletion.wikia.com

Alan-R commented 9 years ago

It's in the github repository. Tools directory maybe?


h4ck3rm1k3 commented 9 years ago
Alan-R commented 9 years ago

Look here https://github.com/IT-bestpractices/root/blob/master/tools/chopstig.py at line 148:

contents = json.dumps(stig_to_itbestpractices(stig), sort_keys=True,
                      indent=2, separators=(', ', ': '))

This is the line that converts the format I created into JSON. I'm sure it'd just be a different one-liner for converting it to YAML. You'd have to have a different import, and remove the "import json".

As it stands, the code passes pylint with no flags or configuration. Any replacement code would have to do the same. But I have to fix the license statement - you can't do that. That was just boilerplate I copied over from the Assimilation Project.
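The swap described above might look roughly like this. It is a sketch under assumptions: `rule` is a hypothetical stand-in for whatever `stig_to_itbestpractices(stig)` returns, and `to_yaml` relies on the third-party PyYAML package, which is not part of the standard library.

```python
import json

# Hypothetical stand-in for the dict returned by stig_to_itbestpractices(stig).
rule = {
    "severity": "medium",
    "tags": {"itbp": ["itbp-00001"]},
    "text": {"en": {"short_description":
                    "The default network queuing discipline should avoid buffer bloat."}},
}


def to_json(data):
    """Serialize with the same arguments as the quoted line from chopstig.py."""
    return json.dumps(data, sort_keys=True, indent=2, separators=(', ', ': '))


def to_yaml(data):
    """Hypothetical YAML counterpart of to_json."""
    import yaml  # assumption: PyYAML is installed; imported lazily so to_json works without it
    return yaml.safe_dump(data, default_flow_style=False)


contents = to_json(rule)
print(contents)
```

Importing `yaml` inside the function keeps the JSON path free of extra dependencies, which matters if JSON remains the delivery format.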


Alan-R commented 9 years ago

OK. I wrote a version of that tools file that creates YAML instead of JSON. It barely made any difference in readability, which surprised me. I thought it might be a bug in my code, but eventually I figured out that it's because of the frequent use of the # character. If a string contains a # or { or } or various other characters such as [ ], then YAML has to fall back to using quoted (") strings and escaping newlines - exactly like JSON. So, after a day's work, I've decided that it doesn't help much, because so many of the long lines contain things which run at a root prompt and therefore have a # character in the description. Sigh...

So, I think I'm going to shelve this. I think YAML is a syntax that doesn't deliver as much as it adds in complexity. I could support either input format, but I'm not sure it adds much to be able to do that. I think simplicity, consistency, and a very familiar (if limited) file format would be a better choice than the many complex and surprising ways that YAML provides to format data. If it weren't for the nearly-universal # in the documents, it might be a better choice...
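For reference, this is the JSON side of that comparison: a multi-line check text containing a root-prompt line, serialized with the standard library. The text is taken from the itbp-00001 example; the variable names are only illustrative.

```python
import json

check = (
    "To check that the correct queue discipline is enabled to avoid buffer bloat, "
    "execute the following command:\n"
    "# cat /proc/sys/net/core/default_qdisc\n"
    "The preferred result is 'fq_codel'."
)

# JSON has a single string form: double-quoted, with newlines written as \n escapes.
encoded = json.dumps({"check": check})
print(encoded)

# The escaping is lossless: decoding gives back the original multi-line text.
decoded = json.loads(encoded)["check"]
```

The whole description ends up on one physical line, which is the readability cost under discussion.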

h4ck3rm1k3 commented 9 years ago

I think the readability issue will be solved when you use Markdown or another ASCII-readable and writable encoding. See http://asciidoctor.org/ or http://pandoc.org/ or http://sphinx-doc.org/


Alan-R commented 9 years ago

There are dozens or maybe even hundreds of markdown schemes (you named three) - and they're all different - and none are standards. The point of this is to provide something for a program to use in displaying in their context to humans. Most of those programs have ready access to HTML rendering - which is why in exactly one description I experimented with using HTML.

The rest of them only require that newlines not be eliminated in order for them to be readable. That's pretty simple. It is sort-of markdown-like, but it's not at all complex. Whatever we do it needs to be simple and universally available.
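A minimal sketch of how a consuming program might render one of these plain-text descriptions as HTML while keeping its newlines. The function name is hypothetical, and it assumes the description contains no intentional HTML markup.

```python
import html


def text_to_html(text):
    """Escape a plain-text description and keep its line breaks as <br> tags."""
    escaped = html.escape(text)  # protects &, <, > and quotes
    return "<p>" + escaped.replace("\n", "<br>\n") + "</p>"


print(text_to_html("Execute the following command:\n# cat /proc/sys/net/core/default_qdisc"))
```

The one description that experiments with embedded HTML would need to bypass the escaping step, so a real renderer would have to know which kind of text it was given.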

The main thing is the content, not the format.

h4ck3rm1k3 commented 9 years ago

Let me ask this: what then is the point of HTML in JSON? Would it not be better to use straight HTML? Then we could just write HTML with certain tags. It does not seem to me that the data is very complex.

borgified commented 9 years ago
h4ck3rm1k3 commented 9 years ago

So if you use Markdown, you can display it and edit it on GitHub like a wiki; that is why it is a good fit for GitHub.

Alan-R commented 9 years ago

Got that.

The markdown you supplied wasn't correct. Please fix it.


h4ck3rm1k3 commented 9 years ago

Yes, I saw the issue. Will look into it.