Closed k-perez closed 3 years ago
Hmmm. That's strange. I'm definitely not seeing that. Does this happen with the file from the PRESTO tutorial? And also, does it take a particularly long amount of time to run rfifind on your data? I have seen bad things happen with not-well-behaved input data before.
No, it is not taking a long time to run rfifind on my data, just a few minutes. And yes, this also happens with the tutorial file; I just tested it.
Can you please post the full output of rfifind after you run it on that test tutorial file?
Also, definitely make sure that there are no other .mask files in the output directory -- especially if they were written by another person so that you have permissions issues!
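A quick way to check for leftover or unwritable output files is a short script like the one below. It is only a sketch: check_rfifind_outputs is a hypothetical helper, not part of PRESTO, and it assumes the outputs follow the <base>_rfifind.<ext> naming that rfifind prints in this thread.

```python
import os
from pathlib import Path

# Extensions that rfifind reports writing in the output shown in this thread.
EXPECTED_EXTS = (".mask", ".rfi", ".stats")


def check_rfifind_outputs(outdir, base_name):
    """Return {filename: status} for each expected <base_name>_rfifind.<ext> file."""
    report = {}
    for ext in EXPECTED_EXTS:
        path = Path(outdir) / f"{base_name}_rfifind{ext}"
        if not path.exists():
            report[path.name] = "missing"
        elif not os.access(path, os.W_OK):
            # A file left behind by another user can block a fresh write.
            report[path.name] = "exists but not writable"
        else:
            report[path.name] = "ok"
    return report
```

Running it before and after rfifind shows whether a stale .mask file was already present and whether a new one appeared.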
Here is the output. Maybe not very informative since the issue is that the .mask file does not exist. It is also not a directory issue; I've checked this too, and the other files (.inf, .stats, .rfi) are all being saved to the same directory.
Reading SIGPROC filterbank data from 1 file:
/Users/kperez/Documents/WVU_data/GBT_Lband_PSR.fil
Number of files = 1
Num of polns = 2 (summed)
Center freq (MHz) = 1400
Num of channels = 96
Sample time (s) = 7.2e-05
Spectra/subint = 2400
Total points (N) = 531000
Total time (s) = 38.232
Clipping sigma = 6.000
Invert the band? = False
Byteswap? = False
Remove zeroDM? = False
File Start Spec Samples Padding Start MJD
---- ---------- ---------- ---------- --------------------
1 0 531000 0 53010.48482638889254
Analyzing data sections of length 38400 points (2.7648 sec).
Prime factors are: 2 2 2 2 2 2 2 2 2 3 5 5
Writing mask data to GBT_Lband_PS_rfifind.mask.
Writing RFI data to GBT_Lband_PS_rfifind.rfi.
Writing statistics to GBT_Lband_PS_rfifind.stats.
Massaging the data ...
Amount Complete = 100%
mask_file /Users/kperez/Documents/WVU_data/GBT_Lband_PS_rfifind.mask
base_name GBT_Lband_PS
mask_file /Users/kperez/Documents/WVU_data/GBT_Lband_PS_rfifind.mask
time 2.0
nchans 96
tsamp 0.072
chanfrac 0.5
og_fil_file [/Users/kperez/Documents/WVU_data/GBT_Lband_PSR.fil]
Traceback (most recent call last):
File "/Users/kperez/Documents/WVU_data/Pulsar_pipelineJ1302.py", line 350, in <module>
rfi_filter(inpath, time_s, timesig, freqsig, chanfrac, intfrac, max_percent, MASKFILE, sp, outpath)
File "/Users/kperez/Documents/WVU_data/Pulsar_pipelineJ1302.py", line 228, in rfi_filter
percentage_flagged, percentage_bad_ints = rfi_quality_check.rfi_check(base_name, mask_file, time, nchans, tsamp, chanfrac)
File "/Users/kperez/Documents/WVU_data/rfi_quality_check.py", line 12, in rfi_check
a = rfifind_bandpass_on.rfifind(mask_file)
File "/Users/kperez/Documents/WVU_data/rfifind_bandpass_on.py", line 39, in __init__
self.read_mask()
File "/Users/kperez/Documents/WVU_data/rfifind_bandpass_on.py", line 58, in read_mask
x = open(self.basename+".mask")
FileNotFoundError: [Errno 2] No such file or directory: /Users/kperez/Documents/WVU_data/GBT_Lband_PS_rfifind.mask
That output definitely helps! It looks like someone has modified the rfifind code to output all of that other stuff, starting with the first mask_file line.
The bottom output should look like:
Amount Complete = 100%
There are 20 RFI instances.
Total number of intervals in the data: 3552
Number of padded intervals: 96 ( 2.703%)
Number of good intervals: 3327 (93.666%)
Number of bad intervals: 129 ( 3.632%)
Ten most significant birdies:
# Sigma Period(ms) Freq(Hz) Number
----------------------------------------------------
1 6.67 11.5844 86.3233 66
2 6.53 11.4564 87.2878 59
3 6.52 11.52 86.8055 66
4 5.73 8.86154 112.847 1
5 5.44 11.6494 85.841 26
6 5.40 11.7818 84.8765 26
7 5.39 11.7153 85.3588 26
8 5.26 8.82383 113.329 1
9 5.03 8.74937 114.294 2
10 5.02 8.78644 113.812 2
Ten most numerous birdies:
# Number Period(ms) Freq(Hz) Sigma
----------------------------------------------------
1 230 34.56 28.9352 4.69
2 131 17.28 57.8704 4.71
3 120 17.4252 57.3881 4.70
4 66 11.5844 86.3233 6.67
5 66 11.52 86.8055 6.52
6 59 11.4564 87.2878 6.53
7 26 11.6494 85.841 5.44
8 26 11.7818 84.8765 5.40
9 26 11.7153 85.3588 5.39
10 21 8.71261 114.776 4.97
Done.
So I suspect that the code has been modified so that it accidentally doesn't write the .mask file any longer!
You can likely check that with a git diff, unless those changes have been committed (then you would have to git diff with the master branch).
I'd recommend that you run rfifind directly on that file to get just its output. Maybe something like: rfifind -time 1.0 -o test GBT_Lband_PSR.fil (which is how I ran it above)
Those other outputs were print statements I added to my pipeline to make sure everything else was working right. Sorry, should've deleted that before posting to avoid confusion. But you're right, there is something else going on that's preventing rfifind from fully running. Running the command above, I get a segmentation fault error.
Assuming the data are SIGPROC filterbank format...
Reading SIGPROC filterbank data from 1 file:
'GBT_Lband_PSR.fil'
Number of files = 1
Num of polns = 2 (summed)
Center freq (MHz) = 1400
Num of channels = 96
Sample time (s) = 7.2e-05
Spectra/subint = 2400
Total points (N) = 531000
Total time (s) = 38.232
Clipping sigma = 6.000
Invert the band? = False
Byteswap? = False
Remove zeroDM? = False
File Start Spec Samples Padding Start MJD
---- ---------- ---------- ---------- --------------------
1 0 531000 0 53010.48482638889254
Analyzing data sections of length 14400 points (1.0368 sec).
Prime factors are: 2 2 2 2 2 2 3 3 5 5
Writing mask data to 'test_rfifind.mask'.
Writing RFI data to 'test_rfifind.rfi'.
Writing statistics to 'test_rfifind.stats'.
Massaging the data ...
Amount Complete = 100%
zsh: segmentation fault rfifind -time 1.0 -o test GBT_Lband_PSR.fil
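The pipeline reported no error because a crash in a child process only shows up in its exit status. The sketch below shows one way a pipeline could catch this. How the actual pipeline launches rfifind is not shown in this thread, so run_and_verify is a hypothetical helper and the use of subprocess is an assumption.

```python
import signal
import subprocess
from pathlib import Path


def run_and_verify(cmd, expected_output):
    """Run cmd; raise if it fails, is killed by a signal, or leaves no output file."""
    result = subprocess.run(cmd)
    if result.returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        name = signal.Signals(-result.returncode).name
        raise RuntimeError(f"{cmd[0]} was killed by {name}")
    if result.returncode != 0:
        raise RuntimeError(f"{cmd[0]} exited with status {result.returncode}")
    if not Path(expected_output).exists():
        raise FileNotFoundError(f"{cmd[0]} finished but did not write {expected_output}")


# Example call, using the command from this thread:
# run_and_verify(["rfifind", "-time", "1.0", "-o", "test", "GBT_Lband_PSR.fil"],
#                "test_rfifind.mask")
```

With a check like this, the segmentation fault would stop the pipeline right away instead of surfacing later as a FileNotFoundError when the mask is read.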
Interesting. I see where the issue might be. What C compiler are you using? And also, on what type of machine are you running?
If I send you a diff, are you able to apply it and re-compile to test?
I am using gcc version 8.5.0 (MacPorts gcc8 8.5.0_0) on macOS Catalina. All the dependencies were installed using macports as well. And yes, I should be able to do that.
OK. I actually made a new PRESTO branch. You can either view the commit that I just made and apply it to your rfifind.c file, or switch to the new branch and re-compile.
Aargh! I accidentally pushed it up to the master branch. So just go ahead and try it there. :-)
I just edited my rfifind.c file and re-compiled, and am still getting the same error
Hmmm. OK. Can you please run the command in gdb? You should just be able to do gdb rfifind, then run -time 1.0 -o test GBT_Lband_PSR.fil, and then when the segfault happens, just do a where and send me everything it says?
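The same steps can also be run non-interactively, which makes the output easier to paste into a report. The sketch below only builds the command line. gdb_backtrace_command is a hypothetical helper; the gdb options used (-batch, -ex, --args) are standard ones.

```python
import shlex


def gdb_backtrace_command(program, program_args):
    """Build a gdb command line that runs the program and prints a backtrace."""
    return [
        "gdb", "-batch",   # exit once the listed commands have finished
        "-ex", "run",      # start the program
        "-ex", "where",    # once it stops (e.g. on SIGSEGV), print the call stack
        "--args", program, *program_args,
    ]


cmd = gdb_backtrace_command(
    "rfifind", ["-time", "1.0", "-o", "test", "GBT_Lband_PSR.fil"]
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to subprocess.run to capture the backtrace in a file.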
I ended up using lldb instead, and this is the output I get.
Writing mask data to 'test_rfifind.mask'.
Writing RFI data to 'test_rfifind.rfi'.
Writing statistics to 'test_rfifind.stats'.
Massaging the data ...
Amount Complete = 100%
Process 42801 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x7fff00050af2)
frame #0: 0x0000000100a29740 libgfortran.5.dylib`_gfortran_string_len_trim + 35
libgfortran.5.dylib`_gfortran_string_len_trim:
-> 0x100a29740 <+35>: cmpb $0x20, (%rsi,%rdx)
0x100a29744 <+39>: leaq -0x1(%rdx), %rax
0x100a29748 <+43>: je 0x100a2974f ; <+50>
0x100a2974a <+45>: leaq 0x1(%rdx), %rax
Target 0: (rfifind) stopped.
I'll keep trying to get gdb to work meanwhile, in case the above isn't helpful.
Yeah, that's not so useful. Can you do the equivalent of "where" from that point so it gives you the function and line number that it was in?
Ok, I got gdb to work and this is the output. I did run make makewisdom when installing, with no errors, and have installed twice now to double check.
Writing mask data to 'test_rfifind.mask'.
Writing RFI data to 'test_rfifind.rfi'.
Writing statistics to 'test_rfifind.stats'.
Massaging the data ...
Amount Complete = 0%Warning: Couldn't open '(null)/lib/fftw_wisdom.txt'
You should run 'makewisdom'. See $PRESTO/INSTALL.
Amount Complete = 100%
It is actually ending after the "Amount Complete = 100%"? And the code hasn't been edited at all?
It has not been edited, except for the changes to rfifind.c from yesterday
So bizarre. It is seemingly just stopping without finishing the rest of the program!
Can you run gdb, put a breakpoint on the function write_mask, and then run it and see if it gets there? If so, then step through line by line.
It is really hard for me to debug things when I cannot replicate an error.
I put a breakpoint at write_mask, rfifind_plot (which is the function right before it), and a few other places to test where it is breaking, and it does not go beyond rfifind_plot, so it is not getting to write_mask. Here is the output. Let me know if I can do anything else to find a more informative bug!
Thread 2 hit Breakpoint 5, 0x000000010003b450 in writeinf ()
(gdb) continue
Continuing.
Analyzing data sections of length 14400 points (1.0368 sec).
Prime factors are: 2 2 2 2 2 2 3 3 5 5
Writing mask data to 'test_rfifind.mask'.
Writing RFI data to 'test_rfifind.rfi'.
Writing statistics to 'test_rfifind.stats'.
Massaging the data ...
Amount Complete = 0%Warning: Couldn't open '(null)/lib/fftw_wisdom.txt'
You should run 'makewisdom'. See $PRESTO/INSTALL.
Amount Complete = 100%
Thread 2 hit Breakpoint 7, 0x0000000100018a30 in rfifind_plot ()
(gdb) continue
Continuing.
Thread 2 received signal SIGSEGV, Segmentation fault.
0x0000000100a29740 in ?? ()
Ah-ha! Now we are getting somewhere. After the segfault, can you do a "w" or "where" to see what line it is on and in what file?
where doesn't seem to specify any lines, but if I do a list, I get this:
Amount Complete = 0%Warning: Couldn't open '(null)/lib/fftw_wisdom.txt'
You should run 'makewisdom'. See $PRESTO/INSTALL.
Amount Complete = 100%
Thread 2 hit Breakpoint 6, 0x0000000100018a30 in rfifind_plot ()
(gdb) continue
Continuing.
Thread 2 received signal SIGSEGV, Segmentation fault.
0x0000000100a29740 in ?? ()
(gdb) l
44 int compare_rfi_sigma(const void *ca, const void *cb);
45 int compare_rfi_numobs(const void *ca, const void *cb);
46 int read_subband_rawblocks(FILE * infiles[], int numfiles, short *subbanddata,
47 int numsamples, int *padding);
48 void get_subband(int subbandnum, float chandat[], short srawdata[], int numsamples);
49 extern int *ranges_to_ivect(char *str, int minval, int maxval, int *numvals);
50
51 /* The main program */
52
53 int main(int argc, char *argv[])
(gdb) where
#0 0x0000000100a29740 in ?? ()
#1 0x0000000100743446 in ?? ()
#2 0x00007ffeefbfe390 in ?? ()
#3 0x00007ffeefbfe38c in ?? ()
#4 0x00007ffeefbfe394 in ?? ()
#5 0x000000010074d933 in ?? ()
#6 0x2020202020202020 in ?? ()
#7 0x0000000020202020 in ?? ()
#8 0x00007ffeefbfe3c0 in ?? ()
#9 0x000000010074e197 in ?? ()
#10 0x000000010004eb62 in ?? ()
#11 0x0000000000000001 in ?? ()
#12 0x00007ffeefbfe3c0 in ?? ()
#13 0x0000000000000004 in ?? ()
#14 0x00007ffeefbfe388 in ?? ()
#15 0x000000010004eb57 in ?? ()
#16 0x00007ffeefbfe38c in ?? ()
#17 0x00000001007429c5 in ?? ()
#18 0x0000000000000001 in ?? ()
#19 0x0000000000000000 in ?? ()
Did you change the PRESTO makefile at all? It doesn't seem like there are debugging symbols compiled into rfifind. When you type "make" are you getting "-g" in each of the gcc command lines?
Nope, I did not change the Makefile, and I am getting a "-g" in each of the gcc commands
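Scanning a long build log by eye for "-g" is error-prone, so a small script can do it instead. This is a sketch under assumptions: compile_lines_missing_flag is a hypothetical helper, the make output is assumed to have been saved as text, and the compiler name may need adjusting if the build uses a differently named gcc.

```python
import shlex


def compile_lines_missing_flag(make_output, compiler="gcc", flag="-g"):
    """Return the compiler command lines in make_output that lack the given flag."""
    missing = []
    for line in make_output.splitlines():
        try:
            tokens = shlex.split(line)
        except ValueError:
            continue  # skip lines with unbalanced quotes
        # Only look at lines that invoke the compiler directly.
        if tokens and tokens[0] == compiler and flag not in tokens:
            missing.append(line)
    return missing
```

An empty result means every compiler invocation in the log included the flag.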
Hmmm. OK. One more question about what you pasted above. So once you get the segmentation fault and do the "where", is that all of the output? Or are there things after #19? Also, if that is all, what happens if you do "up" after the where? Each time you go "up" you should jump up out of the current loop or function to the one above. If you do enough "up"s, we might be able to figure out where we are in the PRESTO code. The reason there are no line numbers might be that we are trapped in a non-PRESTO library.
Yes, that is all of the output, and when I do "up", it just prints out each individual frame (#0 0x0000000100a29740 in ?? ()), but the line numbers are still empty.
I've been playing with lldb, and it does seem to be a bit more informative. I added two breakpoints at write_mask and rfifind_plot. Here we can see that it might be a libgfortran issue?
Amount Complete = 0%Warning: Couldn't open '(null)/lib/fftw_wisdom.txt'
You should run 'makewisdom'. See $PRESTO/INSTALL.
Amount Complete = 100%
Process 7682 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x0000000100018a30 rfifind`rfifind_plot(numchan=96, numint=37, ptsperint=14400, timesigma=10, freqsigma=4, inttrigfrac=0.300000012, chantrigfrac=0.699999988, dataavg=0x0000000100d04380, datastd=0x0000000100d04580, datapow=0x0000000100d046e0, userchan=0x0000000100c09fe0, numuserchan=0, userints=0x0000000100c09f40, numuserints=0, idata=0x00007ffeefbff200, bytemask=0x0000000100d04820, oldmask=0x00007ffeefbfece0, newmask=0x00007ffeefbfed50, rfivect=0x0000000100d04970, numrfi=20, rfixwin=0, rfips=0, xwin=0) at rfifind_plot.c:78:1
75 mask * oldmask, mask * newmask,
76 rfi * rfivect, int numrfi, int rfixwin, int rfips, int xwin)
77 /* Make the beautiful multi-page rfifind plots */
-> 78 {
79 int ii, jj, ct, loops = 1;
80 float *freqs, *chans, *times, *ints;
81 float *avg_chan_avg, *std_chan_avg, *pow_chan_avg;
Target 0: (rfifind) stopped.
(lldb) thread continue
Resuming thread 0x5e2ff in process 7682
Process 7682 resuming
Process 7682 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x7fff00050af2)
frame #0: 0x0000000100a29740 libgfortran.5.dylib`_gfortran_string_len_trim + 35
libgfortran.5.dylib`_gfortran_string_len_trim:
-> 0x100a29740 <+35>: cmpb $0x20, (%rsi,%rdx)
0x100a29744 <+39>: leaq -0x1(%rdx), %rax
0x100a29748 <+43>: je 0x100a2974f ; <+50>
0x100a2974a <+45>: leaq 0x1(%rdx), %rax
Target 0: (rfifind) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x7fff00050af2)
* frame #0: 0x0000000100a29740 libgfortran.5.dylib`_gfortran_string_len_trim + 35
frame #1: 0x0000000100743446 libpgplot5.dylib`grtrim_ + 24
frame #2: 0x000000010074d933 libpgplot5.dylib`pgmtxt_ + 83
frame #3: 0x0000000100729316 libcpgplot5.dylib`cpgmtxt + 96
frame #4: 0x000000010001aa1d rfifind`rfifind_plot(numchan=96, numint=37, ptsperint=14400, timesigma=10, freqsigma=<unavailable>, inttrigfrac=<unavailable>, chantrigfrac=<unavailable>, dataavg=0x0000000100d04380, datastd=0x0000000100d04580, datapow=0x0000000100d046e0, userchan=0x0000000100c09fe0, numuserchan=1, userints=0x0000000100c09f40, numuserints=0, idata=0x00007ffeefbff200, bytemask=0x0000000100d04820, oldmask=0x00007ffeefbfece0, newmask=0x00007ffeefbfed50, rfivect=0x0000000100d04970, numrfi=20, rfixwin=0, rfips=0, xwin=0) at rfifind_plot.c:450:9
frame #5: 0x000000010004aa0d rfifind`main(argc=<unavailable>, argv=<unavailable>) at rfifind.c:468:9
frame #6: 0x00007fff739d1cc9 libdyld.dylib`start + 1
frame #7: 0x00007fff739d1cc9 libdyld.dylib`start + 1
That's good progress! Thanks! It looks like something bad is happening in a PGPLOT cpgmtxt call. So here is something to try: can you try running with "-xwin" and see if you get a plot on the screen? And if so, please take a screenshot of it. I want to carefully check to see if all the labels and text look OK.
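To make the path from PRESTO into PGPLOT easier to see, the lldb frames can be reduced to a module and function per frame. The sketch below uses the frames from the backtrace in this thread, with the long argument lists trimmed. call_chain and the regular expression are illustrative only and assume the frame format lldb printed here.

```python
import re

# Frames copied from the lldb backtrace in this thread (arguments trimmed).
BACKTRACE = """\
  * frame #0: 0x0000000100a29740 libgfortran.5.dylib`_gfortran_string_len_trim + 35
    frame #1: 0x0000000100743446 libpgplot5.dylib`grtrim_ + 24
    frame #2: 0x000000010074d933 libpgplot5.dylib`pgmtxt_ + 83
    frame #3: 0x0000000100729316 libcpgplot5.dylib`cpgmtxt + 96
    frame #4: 0x000000010001aa1d rfifind`rfifind_plot(...) at rfifind_plot.c:450:9
    frame #5: 0x000000010004aa0d rfifind`main(...) at rfifind.c:468:9
"""

# Matches "frame #N: 0xADDR module`function".
FRAME_RE = re.compile(r"frame #(\d+): 0x[0-9a-f]+ ([^`]+)`(\w+)")


def call_chain(backtrace):
    """Return (module, function) pairs from the outermost caller to the crash site."""
    frames = [(m.group(2), m.group(3)) for m in FRAME_RE.finditer(backtrace)]
    return list(reversed(frames))


for module, function in call_chain(BACKTRACE):
    print(f"{module}: {function}")
```

Read top to bottom, the chain goes from rfifind's main through rfifind_plot into the cpgmtxt wrapper, then into the Fortran PGPLOT routines, and ends in libgfortran, where the bad access happens.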
cpgmtxt is known to cause segfaults with certain gcc versions: https://trac.macports.org/ticket/57726
I do not get a plot on the screen, but I am using gcc8 so maybe that is the issue? Let me try re-installing with gcc7
Good grief. That certainly seems like the issue!
Yes, that was the issue! Thanks so much for helping! For the record, I had pgplot for gcc11. Installing pgplot for gcc7 fixed it.
Great! Glad you are set now! It is pretty sad that PGPLOT is unmaintained but still so useful!
Hi, I installed the new v4 PRESTO on my Desktop, and when running the rfifind command within my pipeline, it runs smoothly, and I get the "Writing mask data" and "Amount Complete = 100%" output. However, the .mask file does not get saved. Other files such as .rfi and .stats do get saved as expected. Any idea on what could be happening? There are no errors reported. Thanks!