dmwm / DAS

Data Aggregation System
Code audit: DAS #290

Closed ghost closed 12 years ago

ghost commented 14 years ago

Done. The %post section has been reviewed and cleaned up.

vkuznet commented 14 years ago

valya: Lassi, thank you for the comprehensive feedback. I'm working on all the topics you raised, but want to clarify a few of them:

'''Lat comment:''' Would also suggest to use python configuration file, not .cfg style. '''VK reply:''' I support both configuration styles: via python configuration and cfg. The former is used for the DAS CMS configuration, while cfg can be used when DAS is deployed outside the CMS environment (see below).

'''Lat comment:''' xwho is obsolete and probably should be converted to use xldap.cern.ch. '''VK reply:''' Services such as xwho, ipservice, etc. are used for DAS tests. They're not activated in production, but are pretty useful for testing/development of generic DAS behavior. Service activation is done via the das_cms.py configuration, where all services are listed (aka whitelist).

'''Lat comment:''' DAS seems pretty liberal on using various ports, especially for MongoDB. We should ensure DAS uses port range allocated to it. '''VK reply:''' I don't really understand this statement. What sort of liberty are we talking about? MongoDB port usage is configured via the das_cms.py configuration file, see config.mongodb.dbport = 27017. All connections are made through one port. The port number is not hardcoded in the code. Please elaborate.

'''Lat comment:''' das_top.tmpl pulls yui from yahoo's servers - seems like das should serve it itself. '''VK reply:''' My understanding is that a new service, StaticScruncher, should take care of that. Right now it is not in production, therefore I use Yahoo to serve the YUI files. Also, I intentionally would like to keep the DAS code pretty generic and portable outside of the CMS environment, see below.

'''Lat comment:''' It seems there is very little use of WMCore. I suggest to reconsider reusing at least the base server, possibly quite a bit more from WMCore now that it is available - I appreciate it might not have been in place when this code got started. '''VK reply:''' The usage of WMCore was minimized on purpose due to significant interest in DAS from other scientific domains apart from CMS. This has been discussed with L2 managers up-front. Since none of the DAS code really requires WMCore, including the web base server (the DAS web server depends on CherryPy rather than on WMCore), I would like to keep the DAS core very generic to allow its portability to non-CMS environments.

vkuznet commented 14 years ago

valya: Replying to [comment:18 lat]:

Review note on DAS: curious about this coding pattern:

{{{
#!python
try:
    jsondict = json.loads(data)
except:
    jsondict = eval(data)
}}}

Use of {{{eval()}}}, especially without any dictionary restrictions - eval global and local dictionary options - seems very dangerous. It would seem that if someone manages to compromise some other CMS application, they can also contaminate DAS because it will eval the data produced by those sources. Admittedly I am not sure where the "data" can originate here. But if the data is already sanitised, the construct seems unnecessary, so I am assuming it originates from untrusted outside source.

What's the reason for falling back on eval? Is the data python-esque json, but not completely legal json? Can we not fix the upstream data source? The above example was from SiteDB, so for that we should just fix the output...

Yes, some data services that return JSON, e.g. overview, produce output that is not fully parseable; that was the reason for having eval. I totally agree that using eval is not the best solution and that it is better to fix the upstream source. But this is something of a chicken-and-egg problem: if I switch to strict json.loads, some services/data will become inaccessible, which reduces the value of DAS per se. So I would rely on an L2 decision on whether to fix the DAS code to use json.loads now, or to take a gradual approach and keep eval around until the incoming data from the data services is fixed.

For your convenience I'm attaching the test_monitor.py script, which fails in json.load for the overview service. Feel free to investigate it further.
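One possible middle ground between strict `json.loads` and a bare `eval` is sketched below. It is not what DAS does; it swaps in `ast.literal_eval`, which accepts only Python literals (dicts, lists, strings, numbers, `True`/`False`/`None`) and never calls functions or looks up names. The helper name `parse_payload` is hypothetical.

```python
import ast
import json


def parse_payload(data):
    """Parse a response that may be strict JSON or a Python-style literal.

    Hypothetical helper, not part of DAS. Raises ValueError if neither
    parser accepts the input, so unparseable data is never silently used.
    """
    try:
        return json.loads(data)
    except ValueError:
        pass  # not strict JSON, try Python literal syntax next
    try:
        return ast.literal_eval(data)
    except (ValueError, SyntaxError) as exc:
        raise ValueError("neither JSON nor a Python literal: %s" % exc)


print(parse_payload('{"site": "T1", "ok": true}'))   # strict JSON
print(parse_payload("{'site': 'T1', 'ok': True}"))   # Python-style literal
```

Input such as `__import__('os').getcwd()` is rejected with `ValueError` instead of being executed, which is the main difference from the `eval` fallback quoted above.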

ghost commented 14 years ago

lat: Comments [comment:15 15] and [comment:55 55]: xwho is deprecated and will go away, so even as a test it may stop being useful.

ghost commented 14 years ago

lat: Comments [comment:26 26] and [comment:55 55]: Apologies, partly a misunderstanding here. Indeed there are a ton of ports in use - I counted 424 on vocms53 - but they are mostly ESTABLISHED, not LISTEN, so non-bound randomly assigned ports by the operating system. So it looks like DAS just creates a large number of ports to talk with MongoDB. But please do move !MongoDB to the DAS assigned port range. See [https://cms-http-group.web.cern.ch/cms-http-group/activity.html#info assigned port ranges]. Note though default ulimits on the servers are 1024 file descriptors. You'll want to make sure DAS retains enough available descriptors to talk to clients after it's created sockets to talk to !MongoDB.
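A quick way to reason about the descriptor budget mentioned above is sketched here. It assumes a Unix system (the `resource` module) and that each socket costs one file descriptor in the process that owns it; the function name and the example figures passed to it are illustrative only.

```python
import resource


def descriptor_headroom(open_sockets, soft_limit=None):
    """Return how many descriptors remain once `open_sockets` are in use.

    If `soft_limit` is not given, the current process's soft RLIMIT_NOFILE
    is used. Returns None when the limit is unlimited.
    """
    if soft_limit is None:
        soft_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        return None
    return soft_limit - open_sockets


# Illustrative figures: a 1024-descriptor limit with 424 sockets in use.
print(descriptor_headroom(424, soft_limit=1024))
```

The limit is per process, so the number that matters is the count of sockets held by a single DAS process, not the host-wide total.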

ghost commented 14 years ago

lat: Comments [comment:27 27] and [comment:55 55]: We have had servers deliver internally deployed YUI installation for years. I would suggest this is a case of just making it happen. New services really shouldn't regress in this basic a matter. I can provide example code for bundling minimised collapsed YUI, ExtJS and other resources - javascript, css, sprites, etc. - from DQM GUI / Overview, it should take no more than 30 minutes to adapt it for DAS.

ghost commented 14 years ago

lat: Comments [comment:44 44] plus many others, and [comment:55 55]: Personally my feedback is that this is mission creep.

Basically you would effectively be saying that the one concrete client that funds the development work, CMS, is getting an inferior product because there is unquantified interest from hypothetical other clients. The dependencies on WMCore base server aren't that big, and can certainly be slimmed if necessary. Many issues in deployment would have been avoided if the base server had been used.

I would suggest merely discussing the matter with L2s is not enough. I would suggest we need a statement from the L2s that they actively support this choice. If the latter materialises we will need to redraft the CMSWEB SLA as this is a pretty significant change in direction.

If concrete interest beyond corridor discussions does manifest I'd be quite happy to look at making sure DAS is general and reusable outside CMS. But unless it actually has outside people funded to work on it upfront, I would suggest it needs to be primarily and actively focused on addressing the priorities of its sole client, CMS.

drsm79 commented 14 years ago

metson: Replying to [comment:60 lat]:

Comments [comment:44 44] plus many others, and [comment:55 55]: Personally my feedback is that this is mission creep.

Basically you would effectively be saying that the one concrete client that funds the development work, CMS, is getting an inferior product because there is unquantified interest from hypothetical other clients. The dependencies on WMCore base server aren't that big, and can certainly be slimmed if necessary. Many issues in deployment would have been avoided if the base server had been used.

I don't really recall discussions where not using the WMCore stuff was agreed on - I remember there being features DAS needed that weren't in WMCore that now are. I was surprised by how little DAS uses the WMCore pieces for the web interface related stuff, and think that moving to a WMCore base should be part of the next major release. Using WMCore doesn't preclude its use by other interested parties, since the WMCore code is open source (though don't ask me what license...).

ghost commented 14 years ago

lat: Comments [comment:18 18] and [comment:56 56]: Thanks for the test case. I suggest to actively report any such issues when you run into them. Most of the data sources can be fixed pretty quickly. Could you please file the particular one for overview in savannah, iguana project, and I'll see it gets fixed?

Where you do find yourself stuck having to use eval from outside sources, you should do so with a restricted dictionary to prevent access to outside program with {{{ eval(x, { "__builtins__": None }, {}) }}}

I'd like to see at least the latter fix as soon as possible. Fixing upstream sources I can't say much about until we have a concrete list of bugs/tickets for the specific issues and are able to determine how much work they are to fix.
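The restricted form suggested above can be tried out in isolation. The wrapper name `restricted_eval` is hypothetical; the sketch only shows that builtins such as `__import__` become unreachable while plain literals still evaluate.

```python
def restricted_eval(text):
    """Evaluate `text` with builtins disabled and empty globals/locals."""
    return eval(text, {"__builtins__": None}, {})


# Plain Python-style data still evaluates.
print(restricted_eval("{'site': 'T1', 'count': 3}"))

# Anything that needs a builtin fails instead of running.
try:
    restricted_eval("__import__('os').getcwd()")
except Exception as exc:
    print("rejected:", type(exc).__name__)
```

This narrows the attack surface but is not a full sandbox: an expression can still reach object attributes (for example `().__class__`), so fixing the upstream sources to emit valid JSON remains the safer end state.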

vkuznet commented 14 years ago

valya: Replying to [comment:61 metson]:

Replying to [comment:60 lat]:

Comments [comment:44 44] plus many others, and [comment:55 55]: Personally my feedback is that this is mission creep.

Basically you would effectively be saying that the one concrete client that funds the development work, CMS, is getting an inferior product because there is unquantified interest from hypothetical other clients. The dependencies on WMCore base server aren't that big, and can certainly be slimmed if necessary. Many issues in deployment would have been avoided if the base server had been used.

I don't really recall discussions where not using the WMCore stuff was agreed on - I remember there being features DAS needed that weren't in WMCore that now are. I was surprised by how little DAS uses the WMCore pieces for the web interface related stuff, and think that moving to a WMCore base should be part of the next major release. Using WMCore doesn't preclude its use by other interested parties, since the WMCore code is open source (though don't ask me what license...).

Ok, code will be migrated in next release, see https://svnweb.cern.ch/trac/CMSDMWM/ticket/518

vkuznet commented 14 years ago

valya: Replying to [comment:62 lat]:

Comments [comment:18 18] and [comment:56 56]: Thanks for the test case. I suggest to actively report any such issues when you run into them. Most of the data sources can be fixed pretty quickly. Could you please file the particular one for overview in savannah, iguana project, and I'll see it gets fixed?

Where you do find yourself stuck having to use eval from outside sources, you should do so with a restricted dictionary to prevent access to outside program with {{{ eval(x, { "__builtins__": None }, {}) }}}

I'd like to see at least the latter fix as soon as possible. Fixing upstream sources I can't say much about until we have a concrete list of bugs/tickets for the specific issues and are able to determine how much work they are to fix.

I applied {{{ eval(x, { "__builtins__": None }, {}) }}} in the code. However, I was unable to find overview in savannah. I don't have a clue what it is called there, and searching for '''overview''' does not yield anything. SiteDB also has this problem; the ticket is https://svnweb.cern.ch/trac/CMSDMWM/ticket/523

ghost commented 14 years ago

lat: Replying to comment [comment:64 64]: Thanks. The comment you quoted mentioned "Could you please file the particular one for overview in savannah, iguana project, and I'll see it gets fixed?". Let me know if you still can't find the project and I'll post a direct bug report link.

vkuznet commented 14 years ago

valya: I found the IGUANA savannah project and submitted a ticket there: https://savannah.cern.ch/bugs/index.php?73620

vkuznet commented 14 years ago

valya: Replying to [comment:58 lat]:

Comments [comment:26 26] and [comment:55 55]: Apologies, partly a misunderstanding here. Indeed there are a ton of ports in use - I counted 424 on vocms53 - but they are mostly ESTABLISHED, not LISTEN, so non-bound randomly assigned ports by the operating system. So it looks like DAS just creates a large number of ports to talk with MongoDB. But please do move !MongoDB to the DAS assigned port range. See [https://cms-http-group.web.cern.ch/cms-http-group/activity.html#info assigned port ranges]. Note though default ulimits on the servers are 1024 file descriptors. You'll want to make sure DAS retains enough available descriptors to talk to clients after it's created sockets to talk to !MongoDB.

I want to clarify this a little bit. I can reassign the MongoDB port to the DAS range, but I think we need a separate port range slot for MongoDB itself. It is a database back-end that can be used elsewhere apart from DAS, and then there may be a port conflict between services. By default Mongo uses ports 27017 and 27018. Another point is that we may use sharding and spread Mongo across nodes; in that case a separate port slot would be more convenient. But I don't mind and will follow whatever you decide; I just want to raise the point.

drsm79 commented 14 years ago

metson: I agree with Valentin, MongoDB should be given its own port range (same as CouchDB has 5984), however I'd also like to understand why there are 424 open ports if it's using the two default ports (27017, 27018)...

ghost commented 14 years ago

lat: Regarding [comment:67 ports], there's no problem assigning another port range to MongoDB, but I'd rather not use ports above 10000. We still get machines reallocated from "special" netblocks within CERN, with pass-through permissions for ports in 10000-30000 range.

We've had servers supposedly protected by the global firewall accessed from out in the wild in these high ports, just because the machine was some grid server in some previous life before being reallocated to us, and the global firewall had some huge gaping hole for the netblock it was in.

For example port range 8230 - 8239, just above DAS, is currently not reserved for anything. Can we relocate MongoDB there?

vkuznet commented 14 years ago

valya: I can use this port slot, no problem.

ghost commented 14 years ago

lat: Regarding [comment:68 open connections], they are connected sockets, i.e. sockets between DAS and MongoDB. Just ssh to cmsweb@vocms53.cern.ch and run {{{netstat -tanlp | grep ESTABLISHED | grep 27017}}} to see them. We have currently:

{{{
$ netstat -tanlp | grep ESTABLISHED | grep 27017 | awk '{print $NF}' | sort | uniq -c
    212 4500/mongod
    138 4860/python
     74 4875/python
}}}

Why there are that many I can't answer. Maybe every DAS thread creates some number of connections? Note that half of the sockets are for python side, the other half is the mongod side, as shown above.
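The same per-process tally can be reproduced offline from saved `netstat -tanp` output. The sketch below assumes the usual Linux column layout, with the state in one field and `PID/program` as the last field; the function name and the sample lines are made up for illustration.

```python
from collections import Counter


def count_by_process(netstat_output, port):
    """Count ESTABLISHED connections involving `port`, keyed by PID/program."""
    counts = Counter()
    suffix = ":%d" % port
    for line in netstat_output.splitlines():
        fields = line.split()
        if "ESTABLISHED" not in fields:
            continue  # skip LISTEN, TIME_WAIT, headers, blank lines
        if not any(field.endswith(suffix) for field in fields):
            continue  # neither endpoint uses the port of interest
        counts[fields[-1]] += 1
    return counts


# Made-up sample in the shape of `netstat -tanp` output.
sample = """\
tcp 0 0 127.0.0.1:27017 127.0.0.1:40001 ESTABLISHED 4500/mongod
tcp 0 0 127.0.0.1:40001 127.0.0.1:27017 ESTABLISHED 4860/python
tcp 0 0 0.0.0.0:27017 0.0.0.0:* LISTEN 4500/mongod
"""
for process, total in sorted(count_by_process(sample, 27017).items()):
    print(total, process)
```

Each local connection appears twice, once from each end, which matches the observation that the mongod count equals the sum of the python counts.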

vkuznet commented 14 years ago

valya: Replying to [comment:69 lat]:

Regarding [comment:67 ports], there's no problem assigning another port range to MongoDB, but I'd rather not use ports above 10000. We still get machines reallocated from "special" netblocks within CERN, with pass-through permissions for ports in 10000-30000 range.

We've had servers supposedly protected by the global firewall accessed from out in the wild in these high ports, just because the machine was some grid server in some previous life before being reallocated to us, and the global firewall had some huge gaping hole for the netblock it was in.

For example port range 8230 - 8239, just above DAS, is currently not reserved for anything. Can we relocate MongoDB there?

Please open new tickets for new features and defects, rather than keeping a monster thread. This task has been assigned to ticket #528

vkuznet commented 14 years ago

valya: Replying to [comment:70 valya]:

I can use this port slot, no problem.

This will be tracked separately, see #529

vkuznet commented 14 years ago

valya: I walked through all of the issues originally raised in this ticket. The attached patch addresses issues with the bin area, input parameter validation, JSON vs eval, templates, URL quoting, wrap2das, etc. It also fixes the following tickets: #493, #494, #446, #449, #447

All other issues are relocated to stand-alone tickets: #528, #529, #523, #398, #452

vkuznet commented 14 years ago

valya: (In 29f1b00a333243e50c51b3f6fe7ee4b759437fd9) Work on code based on Code Audit, fixes #290, #493, #494, #446, #449, #447

Signed-off-by: Valentin Kuznetsov vkuznet@gmail.com

ghost commented 14 years ago

lat: There's a few issues with the patch, so I am re-opening the ticket. If you want to split these off to new tickets, that's fine, but I thought the comments belong here. Am adding them as individual comments for easier reference.

ghost commented 14 years ago

lat: Here in das_map, you should take $dir as command line argument, not make checks on host name:

{{{
#!diff
+if [ `hostname -d` == "cern.ch" ]
+    dir=/data/projects/das/config/maps
+else
+    dir=$DAS_ROOT/src/python/DAS/services/maps
+fi
}}}

ghost commented 14 years ago

lat: Comment [comment:12 12] wasn't really answered, the {{{dassh}}} was just removed. Is ipython dependency completely removed now, in RPM rules as well, and ipython is completely irrelevant to DAS?

ghost commented 14 years ago

lat: Comment [comment:10 10] still needs addressing. Will we see init scripts folded into DAS 'manage'?

ghost commented 14 years ago

lat: Comment [comment:18 18] follow-up: thanks for switching to the new {{{eval()}}} scheme. Though I would note that the {{{eval()}}} isn't inside {{{try ... except}}} - it probably should be. Or are you deliberately intending it to raise an exception?

ghost commented 14 years ago

lat: Comment [comment:21 21] follow-up: I'd still find {{{isinstance}}} more readable. (Cf. PEP 8.)
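For context, a small illustration of the difference, assuming the code under review compared `type(x)` against a type object (the thread does not show the original lines). `Record` is a hypothetical stand-in for any dict subclass.

```python
class Record(dict):
    """Hypothetical dict subclass, used only for this illustration."""


row = Record(site="T1")

# Exact type comparison rejects subclasses.
print(type(row) is dict)

# isinstance accepts subclasses and can test several types at once.
print(isinstance(row, dict))
print(isinstance(row, (list, dict)))
```

The first check prints `False` and the other two print `True`, which is why `isinstance` is usually the more robust as well as the more readable choice.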

ghost commented 14 years ago

lat: Comment [comment:23 23] follow-up: I guess it wasn't clear enough, but "session arguments" meant "session" and "version". All you need is the actual data arguments. Also would prefer they were deleted, not just commented out.

ghost commented 14 years ago

lat: I didn't understand the addition of urllib quoting in, for example, das_table.tmpl. Shouldn't you use encodeURIComponent in javascript code / arguments, and urllib when quoting something originating from DAS server itself? To me it seems you are now sometimes quoting javascript itself, not the javascript variable value.
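To make the distinction concrete: a value that the server itself places into a URL can be percent-encoded on the server, whereas a value that only exists in the browser has to be encoded there with `encodeURIComponent`. The sketch below covers the server side only and uses Python 3's `urllib.parse.quote` (in the Python 2 of the time this lived in `urllib`). The path `/das/request` and the parameter name `input` are assumptions for illustration.

```python
from urllib.parse import quote


def das_link(base_url, query):
    """Build a link with the query value percent-encoded on the server."""
    # safe="" also encodes "/", so the value cannot alter the URL structure.
    return "%s?input=%s" % (base_url, quote(query, safe=""))


print(das_link("/das/request", "site=T1_CH_CERN & dataset=/a/b/c"))
```

Quoting the value, rather than a fragment of javascript source, keeps the template readable and makes it easier to audit which values have been sanitised.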

ghost commented 14 years ago

lat: Also I note here that the quoting wasn't added universally everywhere - not in all templates, and not even systematically in the one example I happened to quote, das_table.tmpl. As I wrote before, it looks like every template needs to be sanitised. I can't easily tell which values are safe.

ghost commented 14 years ago

lat: I would suggest to remove code, not comment it out. There can be exceptions, but I'd like to see most of the commented out code removed. There's version management history if you need to go back; the history doesn't need to be in the code itself.

ghost commented 14 years ago

lat: Thank you for adding {{{checkargs}}} to verify parameters. It has a few flaws I'd like to see fixed:

ghost commented 14 years ago

lat: Quite a number of comments above had no response or follow-up, and were not addressed in the patch or new tickets created. Should I assume closing the ticket means the comments are dismissed?

ghost commented 14 years ago

lat: Forgot to mention that the logger change in json_parser() seems to have a bug as it calls logger.warining, not .warning.

vkuznet commented 14 years ago

valya: Replying to [comment:76 lat]:

There's a few issues with the patch, so I am re-opening the ticket. If you want to split these off to new tickets, that's fine, but I thought the comments belong here. Am adding them as individual comments for easier reference.

Lassi, I prefer individual tickets, so I'll walk through your comments and create them. This ticket was closed automatically when the patch was committed, so the closing was not intentional and does not mean the issues are dismissed.

vkuznet commented 14 years ago

valya: Replying to [comment:78 lat]:

Comment [comment:12 12] wasn't really answered, the {{{dassh}}} was just removed. Is ipython dependency completely removed now, in RPM rules as well, and ipython is completely irrelevant to DAS?

ipython is not part of DAS; I used it to manage mongo, so we can safely remove this dependency. This should be done in the das.spec file and is therefore not reflected in the patch.

vkuznet commented 14 years ago

valya: Replying to [comment:80 lat]:

Comment [comment:18 18] follow-up: thanks for switching to the new {{{eval()}}} scheme. Though I would note that the {{{eval()}}} isn't inside {{{try ... except}}} - it probably should be. Or are you deliberately intending it to raise an exception?

Yes, I want to get an exception if eval does not succeed. I don't know how to handle input data that I can't parse, and throwing an exception is the best way to deal with unparseable data.

vkuznet commented 14 years ago

valya: Replying to [comment:81 lat]:

Comment [comment:21 21] follow-up: I'd still find {{{isinstance}}} more readable. (Cf. PEP 8.)

Opened a new ticket, https://svnweb.cern.ch/trac/CMSDMWM/ticket/541

vkuznet commented 14 years ago

valya: Replying to [comment:86 lat]:

Thank you for adding {{{checkargs}}} to verify parameters. It has a few flaws I'd like to see fixed:

  • You don't use what you verify. Some arguments are cast to strings (str(x)) before checking. You should instead verify what you will use.
  • You should type check all arguments for reasons above. A keyword argument can be None (not given), a string (given once), or a list (if given several times).
  • Contents of many, but not all arguments are checked. I didn't see any additional checking added for remaining arguments elsewhere so it looks like several vulnerabilities remain. You should always sanitise all arguments. Even if the argument is free form input, you can often make sure it only consists of certain legitimate characters (e.g. letters only).
  • Failure to verify arguments should raise an exception.
  • Failure to check an argument should not return the argument value back to caller. This is unsafe; you don't know what the value contains, and you just determined it's not valid. Returning the value to caller can be used to create XSS and other attacks. My general preference is to never return anything to the caller - you simply return suitable HTTP status code.
  • It's not sanitising the HTTP method; note that 'method' keyword argument is not the same as the request method!

See new ticket #542
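A minimal sketch of a validator that follows the points listed above; it is not the DAS `checkargs` implementation. The names `check_arg` and `InvalidArgument` are hypothetical, and rejecting repeated arguments outright is one possible policy among several.

```python
import re


class InvalidArgument(Exception):
    """Hypothetical error type; a web layer would map it to an HTTP 400."""


def check_arg(name, value, pattern, required=False):
    """Validate one keyword argument and return the exact value checked."""
    if value is None:
        if required:
            raise InvalidArgument("missing argument: %s" % name)
        return None
    if isinstance(value, list):
        # The argument was given several times; reject rather than guess.
        raise InvalidArgument("argument given several times: %s" % name)
    if not isinstance(value, str):
        raise InvalidArgument("unexpected type for argument: %s" % name)
    if re.fullmatch(pattern, value) is None:
        # Name the argument, but never echo the rejected value back.
        raise InvalidArgument("invalid value for argument: %s" % name)
    return value


print(check_arg("idx", "10", r"\d+"))
try:
    check_arg("idx", "<script>alert(1)</script>", r"\d+")
except InvalidArgument as exc:
    print(exc)
```

The value is checked in the form in which it will be used, failures raise instead of returning, and the error message contains only the argument name, so nothing attacker-controlled is reflected to the caller.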

vkuznet commented 14 years ago

valya: Replying to [comment:87 lat]:

Quite a number of comments above had no response or follow-up, and were not addressed in the patch or new tickets created. Should I assume closing the ticket means the comments are dismissed?

Lassi, I would prefer individual tickets, since they are much easier to track than a monolithic thread, where an issue can easily get lost, not because it was dismissed but because it was unintentionally passed over. Feel free to open tickets for those comments which are not addressed. Having individual tickets also simplifies code patching/management.

vkuznet commented 14 years ago

valya: Replying to [comment:88 lat]:

Forgot to mention that the logger change in json_parser() seems to have a bug as it calls logger.warining, not .warning.

Thanks for catching this typo. It's fixed now.

vkuznet commented 14 years ago

valya: Replying to [comment:79 lat]:

Comment [comment:10 10] still needs addressing. Will we see init scripts folded into DAS 'manage'?

Working on it, now in separate ticket #543

vkuznet commented 14 years ago

valya: Replying to [comment:82 lat]:

Comment [comment:23 23] follow-up: I guess it wasn't clear enough, but "session arguments" meant "session" and "version". All you need is the actual data arguments. Also would prefer they were deleted, not just commented out.

Now under #544

vkuznet commented 14 years ago

valya: Replying to [comment:84 lat]:

Also I note here that the quoting wasn't added universally everywhere - not in all templates, and not even systematically in the one example I happened to quote, das_table.tmpl. As I wrote before, it looks like every template needs to be sanitised. I can't easily tell which values are safe.

This and previous comment now under separate ticket, #545

vkuznet commented 14 years ago

valya: Replying to [comment:85 lat]:

I would suggest to remove code, not comment it out. There can be exceptions, but I'd like to see most of the commented out code removed. There's version management history if you need to go back; the history doesn't need to be in the code itself.

Now under #546

vkuznet commented 14 years ago

valya: Replying to [comment:76 lat]:

There's a few issues with the patch, so I am re-opening the ticket. If you want to split these off to new tickets, that's fine, but I thought the comments belong here. Am adding them as individual comments for easier reference.

Lassi, please review this one more time. I reassigned all your comments to separate tickets. If I missed something, please open a new ticket. We can keep this ticket open, as you wish, until all the other ones are addressed. Since we put this into the DAS-Commissioning milestone, the milestone can be completed once all tickets assigned to it are closed.

ghost commented 14 years ago

lat: I'd prefer not to go through the comments one by one myself. I'd suggest you do that. Just start from the top and create a ticket for every item that wasn't yet covered.

vkuznet commented 13 years ago

valya: Replying to [comment:15 lat]:

Review note on DAS: xwho is obsolete and probably should be converted to use xldap.cern.ch. See {{{phonebook}}} command on any CERN linux system. It's just a perl script which makes a simple LDAP query to xldap.cern.ch. Note that xldap.cern.ch is only available within CERN, (testing) outside CERN you'll need (SSL + password authenticated) connection to ldap.cern.ch instead.

All YML files will move into SITECONF, see ticket #546. Xwho will not be present over there.

vkuznet commented 13 years ago

valya: Replying to [comment:77 lat]:

Here in das_map, you should take $dir as command line argument, not make checks on host name:

{{{
#!diff
+if [ `hostname -d` == "cern.ch" ]
+    dir=/data/projects/das/config/maps
+else
+    dir=$DAS_ROOT/src/python/DAS/services/maps
+fi
}}}

Ticket #550

vkuznet commented 13 years ago

valya: Replying to [comment:16 lat]:

Review note on DAS: would suggest to generally consider slight flattening of name space. "from DAS.utils.utils import foo" seems a little redundant :-)

I don't think this is relevant; it's a matter of preference. The DAS code structure was established based on WMCore rules, and WMCore has similarly long namespaces. I'll skip it, since it would involve a large redesign of the directory structure, changing all the names, etc.