open-iscsi / open-isns

iSNS server and client for Linux
GNU Lesser General Public License v2.1
26 stars 22 forks source link

The open-isns tests do not all pass! #9

Closed gonzoleeman closed 7 years ago

gonzoleeman commented 8 years ago

As part of some recent debugging on open-isns I figured out how to run the self tests in the source tree and then ran them. Several of them fail!

These need to be addressed, but I do not have the time right now, so this issue will record this need.

nacc commented 7 years ago

Is it just me or do the tests expect the db dumps to be in index order? I have a patch locally that does it grossly in a post-process loop.

nacc commented 7 years ago

Actually, it's even worse, some tests are sorted and some aren't. It seems incredibly unreliable to rely on that?

gonzoleeman commented 7 years ago

I need to look at the code. My initial thought is to wonder why the data is not stored in sorted order.

nacc commented 7 years ago

Yep, it seems there are some testcases that do add / removals and I'm not sure why it matters (or maybe it does for a different reason) that the database dump is in the exact order that is specified in the sample comparison and since the comparison is done element-to-element in-order, if (for any reason) the database dump will fail even if the contents are the same.

I'm seeing this on Ubuntu 17.10, where I tried to run the self-tests during the build of the package (I am trying to get it included in Ubuntu main so that open-iscsi can leverage it (a Debian change recently introduced the dependency).

Changing the reference files to be in order did allow most of them to pass. I think, though, some tests actually require isnsd to be installed to run? Since I was doing a build-time test, that's not the case, so I'm just trying to understand.

gonzoleeman commented 7 years ago

Personally I need to study this more, but I don't think the order of the DB should matter. Or, if it does matter, perhaps it should be kept in sorted order so we always get the same result from the same operations, regardless of order?

nacc commented 7 years ago

@gonzoleeman agreed, it seems like it shouldn't be necessary. Let me send a PR in a bit with my changes locally.

nacc commented 7 years ago

The following dumb patch:

--- a/db.c
+++ b/db.c
@@ -929,7 +929,7 @@
 isns_db_print(isns_db_t *db, isns_print_fn_t *fn)
 {
        const isns_object_list_t *list = db->id_objects;
-       unsigned int    i;
+       unsigned int    i, j;

        fn("Dumping database contents\n"
           "Backend:     %s\n"
@@ -940,22 +940,26 @@
           db->id_last_eid,
           db->id_last_index);

-       for (i = 0; i < list->iol_count; ++i) {
-               isns_object_t *obj = list->iol_data[i];
-
-               fn("--------------\n"
-                  "Object:      index=%u type=<%s> state=%s",
-                  obj->ie_index,
-                  obj->ie_template->iot_name,
-                  isns_object_state_string(obj->ie_state));
-               if (obj->ie_container)
-                       fn(" parent=%u", obj->ie_container->ie_index);
-               if (obj->ie_flags & ISNS_OBJECT_DIRTY)
-                       fn(" DIRTY");
-               if (obj->ie_flags & ISNS_OBJECT_PRIVATE)
-                       fn(" PRIVATE");
-               fn("\n");
-               isns_attr_list_print(&obj->ie_attrs, fn);
+       for (i = 0; i < db->id_last_index; ++i) {
+               for (j = 0; j < list->iol_count; ++j) {
+                       isns_object_t *obj = list->iol_data[j];
+                       if (obj->ie_index != i) {
+                               continue;
+                       }
+                       fn("--------------\n"
+                          "Object:      index=%u type=<%s> state=%s",
+                          obj->ie_index,
+                          obj->ie_template->iot_name,
+                          isns_object_state_string(obj->ie_state));
+                       if (obj->ie_container)
+                               fn(" parent=%u", obj->ie_container->ie_index);
+                       if (obj->ie_flags & ISNS_OBJECT_DIRTY)
+                               fn(" DIRTY");
+                       if (obj->ie_flags & ISNS_OBJECT_PRIVATE)
+                               fn(" PRIVATE");
+                       fn("\n");
+                       isns_attr_list_print(&obj->ie_attrs, fn);
+               }
        }
 }

results in most tests passing in 17.10: http://paste.ubuntu.com/24859507/ I'm working through the ones that fail now.

nacc commented 7 years ago

Hrm, in tests/test06.pl, I think the ":iscsi" is wrong? I see some code in isns to handle this, but the socket parsing code (isns_portal_parse and friends) only handles ":isns" as a special case, and isnsadm (which test06.pl ends up invoking) calls that.

test06 seems to have other issues, too, but updating it with the previous patch + sorting the tests/data/test06 results and fixing the port to be 860 (I think the intention was for it to be derived as the initiator port by isnsadm, but I see no code to do this?), test06 now passes locally: http://paste.ubuntu.com/24860355/

The failing tests I see on 17.10 are only test02: http://paste.ubuntu.com/24860375/, which appears to be some sort of data duplication issue and test11: http://paste.ubuntu.com/24860381/. Any suggestions for debugging either will be greatly appreciated!

nacc commented 7 years ago

@gonzoleeman I just dumped 4 PRs that I think get us much closer. One is pretty clearly a bugfix (although I may be misunderstanding the logic completely) in the Discovery Domain code.

With these changes, I get all but test11 to pass in as-installed testing of 0.97 in Ubuntu 17.04! I am having a helluva time even understanding what test11 is trying to do -- does it actually need network connectivity to microsoft?

gonzoleeman commented 7 years ago

I will try to run the test suite soon, and perhaps I can close this issue, thanks to your submissions.

gonzoleeman commented 7 years ago

FWIW I am looking at this again, now that I have a bit of spare time, and I concur that only test11.pl is not working. I'm dropping back and re-reading the spec before I try to diagnose what that test is doing.

gonzoleeman commented 7 years ago

Ok, I found two issues with test11.pl. And no, it is not trying to connect to Microsoft. I've actually added comments to the program pauw4.c, which is the basis for test11.pl, describing what it is testing. There were two issues fixed, though I did some cleanup of the test code, as well. I'm marking this issue closed.

nacc commented 7 years ago

Nice work Lee!