manodeep / Corrfunc

⚡️⚡️⚡️Blazing fast correlation functions on the CPU.
https://corrfunc.readthedocs.io
MIT License
165 stars 50 forks source link

Strange output from 1 bin in DDtheta_mocks #160

Closed manodeep closed 6 years ago

manodeep commented 6 years ago

I have a single bin in a file, binfile, but the file also contains a bunch of new-lines. When I run DDtheta_mocks, the number of bins includes the total number of lines, and the output is fine for the first bin but really garbage for all next bins.

This is because of a bug in the function getnumlines in utils.c that does not strip off new-lines or white-space. The relevant comment within utils.c is:

            //WARNING: this does not remove white-space. You might
            //want to implement that (was never an issue for me)
            if(str_line[0] !=comment)

The patch would be:

-            if(str_line[0] !=comment)
+            char *c = &str_line[0];
+            while(c && isspace(*c)) {
+                c++;
+            }
+            if(c != NULL && *c != '\0' && *c !=comment) {
                 nlines++;
manodeep commented 6 years ago

@lgarrison While doing this, I noticed that this line checks for thetamin > 0.0, but we don't really divide by thetamin anywhere -- so the check should really be thetamin >= 0.0.

On a related issue, since we do allow for 0.0 as rmin, then we should include your corrections for rmin == 0.0, i.e., these lines. But none of the mocks pair-counters currently have that correction included. Do you remember if there was any reason that we didn't include the correction for the mocks pair-counters, or is that simple over-sight?

lgarrison commented 6 years ago

I guess it was just an oversight, although we never claimed in the documentation that self pairs are counted for the mocks. But for the consistency of the auto- and cross-correlation it seems like that should be the case.