SuperDARNCanada / borealis

A control system for USRP based digital radars
GNU General Public License v3.0

single-channel test setup (reduced dependencies?) #401

Closed alexchartier closed 4 months ago

alexchartier commented 1 year ago

We’re trying to test some amps at APL, using a single-channel version of Borealis to supply the test signals. Do I have to get an NVIDIA GPU to build the code? If not, is there some way to disable the CUDA dependency?

RemingtonRohel commented 1 year ago

The signal processing is flexible: it falls back to using numpy when cupy isn't available, so you shouldn't need an NVIDIA GPU.
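For readers unfamiliar with this kind of fallback, here is a minimal sketch of the general pattern. It is not taken from the Borealis source; the function name `load_array_module` is made up for illustration, and the only assumption is that the GPU and CPU libraries expose a compatible array API (as cupy and numpy largely do).

```python
import importlib


def load_array_module(candidates=("cupy", "numpy")):
    """Return the first importable module from candidates, tried in order.

    Hypothetical helper: cupy is tried first, and numpy is used when
    cupy is not installed on the machine.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # this library is unavailable; try the next candidate
    raise ImportError(f"none of {candidates} could be imported")
```

Calling `xp = load_array_module()` once at startup and then writing the processing code against `xp` (for example `xp.asarray(samples)`) keeps the rest of the code unaware of which backend was picked.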

RemingtonRohel commented 1 year ago

What version of Borealis are you using? My previous comment only applies to Borealis v0.6+. For v0.5, I'd have to dig around to see if you can circumvent the CUDA dependency.

alexchartier commented 1 year ago

i'm on the main - scons complains when it doesn't find the deviceQuery

alexchartier commented 1 year ago

(/usr/local/cuda/extras/demo_suite/deviceQuery not found) I'll see if I can fix that, but it seems redundant when there's no GPU in the box

RemingtonRohel commented 1 year ago

It might be easiest to edit the scons configuration. You could try commenting out line 113 of site_scons/site_config.py and lines 32-35 of tools/dsp_testing/SConscript

alexchartier commented 1 year ago

Somehow it's still looking for CUDA_TOOLKIT_PATH. I set it to a dummy value, but that wasn't enough.

RemingtonRohel commented 1 year ago

Ah I think I was missing one last place. Line 27 of borealis/SConstruct, if you remove the nvcc entry from the list it should quit trying to compile .cu files.

alexchartier commented 1 year ago

Thanks. I got it built, but there may be a new bug in steamed_hams: os.environ['PYTHON_VERSION'] seems to be gone in 3.10.6 on Ubuntu. I used PYTHON_VERSION = platform.python_version() instead. This seems to be an acceptable workaround - let me know if not. I think it makes sense to change that in the repo.

RemingtonRohel commented 1 year ago

Ah that would be a configuration thing that we've set up. Part of the install script sets PYTHON_VERSION in the .profile file in the user home directory. You can check if it's set there, and if so you may have just needed to run source ~/.profile. If not, I'm not sure why it didn't get set, but we've taken care in the develop branch to make sure that it works smoothly in the future.

alexchartier commented 1 year ago

Ah OK - I didn't run the install script because it installs everything as root. I think that's not good for pip packages. I'll run the .profile piece though. Either way, why not get the Python version using platform and reduce the number of environment variables?

RemingtonRohel commented 1 year ago

We did change the install script to install pip packages as a specified user instead of root - but if you were avoiding installing CUDA and cupy then skipping the install script makes sense, or commenting out those installations from the bottom of the script. Our thinking was that if you wanted to have a couple different python versions then that environment variable would be the sole place you have to change to switch between them.
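The two positions in this exchange are not mutually exclusive. As a rough sketch (not the actual steamed_hams code; the function name `python_version` is invented here), the environment variable can remain the single place to switch versions while the running interpreter's version serves as a fallback when it is unset:

```python
import os
import platform


def python_version(environ=None):
    """Return PYTHON_VERSION if it is set, else the running interpreter's version.

    Hypothetical helper: environ defaults to os.environ and is only a
    parameter so the lookup can be exercised without touching the real
    environment.
    """
    environ = os.environ if environ is None else environ
    # An unset or empty variable falls through to the interpreter version.
    return environ.get("PYTHON_VERSION") or platform.python_version()
```

With this approach a user who skipped the install script would not hit a KeyError, and a user who wants a different interpreter can still export PYTHON_VERSION.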

alexchartier commented 1 year ago

OK - where's the config.ini? I ran the install script with the "upgrade-to-v06" (to avoid installing anything) but I don't see any *ini in there


RemingtonRohel commented 1 year ago

There probably isn't one, since we don't have your config file in our subrepo, but it should go in borealis/borealis_config_files/xxx_config.ini where xxx is your three-letter radar code. The config format also changed from v0.5 to v0.6; you can check out ours to see the difference or look here: https://borealis.readthedocs.io/en/latest/config_options.html

alexchartier commented 1 year ago

That directory is empty for me - I don't know why that's different than on GitHub. Either way I'll grab one from there and edit

RemingtonRohel commented 1 year ago

I should be more specific - the fields that changed are:

If you prefer I can update the config file for you, all I would need is the current one you are using.

alexchartier commented 1 year ago

Thanks. This is starting to deviate quite a bit from the original issue, but I have the following issues outstanding:

[N200_Driver] Error: std::bad_alloc

[all others] bash: line 1: borealis_env3.10.6/bin/activate: no such file or directory

Note the second one is true - there's nothing in that /bin/ dir. I'm not sure how to debug the N200_Driver issue.

Alex

RemingtonRohel commented 1 year ago

  1. Do you have a larger traceback I could see?
  2. You will have to create this virtual environment - this is how we decided to implement supporting multiple python versions. To do so, I would first change your PYTHON_VERSION to 3.10 instead of 3.10.6, otherwise you'll probably have an issue down the road where steamed_hams tries to call python3.10.6 and throws a command-not-found error. Note you'll have to source ~/.profile again after you change it. Then you should be able to run the following commands to make the virtualenv and be good to go.

mkdir -p $BOREALISPATH/borealis_env3.10
python3.10 -m venv $BOREALISPATH/borealis_env3.10
$BOREALISPATH/borealis_env3.10/bin/python3 -m pip install zmq numpy scipy protobuf==3.19.4 posix_ipc git+https://github.com/SuperDARN/pyDARNio.git@develop git+https://github.com/SuperDARNCanada/backscatter.git#egg=backscatter
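The 3.10 versus 3.10.6 mismatch described above can be illustrated with a short sketch. It assumes, based only on the error message and commands in this thread, that the virtualenv lives at $BOREALISPATH/borealis_env<version>/bin/activate; the function name `venv_activate_path` is hypothetical and not part of Borealis.

```python
import os


def venv_activate_path(borealis_path, python_version):
    """Build the activate-script path for a version like '3.10' or '3.10.6'.

    Hypothetical helper: the patch component is dropped so that '3.10.6'
    and '3.10' both resolve to the same borealis_env3.10 directory.
    """
    major_minor = ".".join(python_version.split(".")[:2])  # '3.10.6' -> '3.10'
    return os.path.join(borealis_path, f"borealis_env{major_minor}", "bin", "activate")
```

Checking `os.path.exists(venv_activate_path(os.environ["BOREALISPATH"], version))` before launching would surface a missing virtualenv earlier than the bash "no such file or directory" message.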
alexchartier commented 1 year ago

Not sure on the larger traceback for usrp_driver - that's all that was in the command window and the log. The other stuff is cleared up now

Alex


RemingtonRohel commented 1 year ago

I think I figured it out. This has been fixed in develop; issue #351 describes why. As a workaround, set the rx_int flag to true for your connected n200 and set the interferometer_antenna field to 0. That should make it work.

RemingtonRohel commented 1 year ago

In the config file, I should add

alexchartier commented 1 year ago

Thanks! I'll try that when I get back to the lab

alexchartier commented 1 year ago

I'm getting an error: "interferometer_antennas and interferometer_antenna_count have incompatible values" Could you let me know what's wrong with the config file?

{
    "site_id" : "apl",
    "gps_octoclock_addr" : "addr=192.168.10.3",
    "device_options" : "recv_frame_size=4000",
    "main_antenna_count" : "1",
    "interferometer_antenna_count" : "0",
    "n200s" : [
        {
            "addr" : "192.168.10.2",
            "rx" : true,
            "tx" : true,
            "rx_int" : true,
            "main_antenna" : "0",
            "interferometer_antenna" : "0"
        }
    ],
    "main_antenna_spacing" : "15.24",
    "interferometer_antenna_spacing" : "15.24",
    "min_freq" : "8.0e6",
    "max_freq" : "20.0e6",
    "minimum_pulse_length" : "100",
    "minimum_tau_spacing_length" : "1",
    "minimum_pulse_separation" : "125",
    "tx_subdev" : "A:A",
    "max_tx_sample_rate" : "5.0e6",
    "main_rx_subdev" : "A:A A:B",
    "interferometer_rx_subdev" : "A:A A:B",
    "max_rx_sample_rate" : "5.0e6",
    "pps" : "external",
    "ref" : "external",
    "overthewire" : "sc16",
    "cpu" : "fc32",
    "gpio_bank_high" : "RXA",
    "gpio_bank_low" : "TXA",
    "atr_rx" : "0x0006",
    "atr_tx" : "0x0018",
    "atr_xx" : "0x0060",
    "atr_0x" : "0x0180",
    "lo_pwr" : "0x0600",
    "agc_st" : "0x1800",
    "tst_md" : "0x6000",
    "max_usrp_dac_amplitude" : "0.99",
    "pulse_ramp_time" : "10.0e-6",
    "gpio_bank" : "RXA",
    "tr_window_time" : "60e-6",
    "agc_signal_read_delay" : "0",
    "usrp_master_clock_rate" : "100e6",
    "max_output_sample_rate" : "100.0e3",
    "max_number_of_filtering_stages" : "6",
    "max_number_of_filter_taps_per_stage" : "2048",
    "router_address" : "tcp://127.0.0.1:6969",
    "realtime_address" : "tcp://eth0:9696",
    "radctrl_to_exphan_identity" : "RADCTRL_EXPHAN_IDEN",
    "radctrl_to_dsp_identity" : "RADCTRL_DSP_IDEN",
    "radctrl_to_driver_identity" : "RADCTRL_DRIVER_IDEN",
    "radctrl_to_brian_identity" : "RADCTRL_BRIAN_IDEN",
    "radctrl_to_dw_identity" : "RADCTRL_DW_IDEN",
    "driver_to_radctrl_identity" : "DRIVER_RADCTRL_IDEN",
    "driver_to_dsp_identity" : "DRIVER_DSP_IDEN",
    "driver_to_brian_identity" : "DRIVER_BRIAN_IDEN",
    "driver_to_mainaffinity_identity" : "DRIVER_MAINAFFINITY_IDEN",
    "driver_to_txaffinity_identity" : "DRIVER_TXAFFINITY_IDEN",
    "driver_to_rxaffinity_identity" : "DRIVER_RXAFFINITY_IDEN",
    "mainaffinity_to_driver_identity" : "MAINAFFINITY_DRIVER_IDEN",
    "txaffinity_to_driver_identity" : "TXAFFINITY_DRIVER_IDEN",
    "rxaffinity_to_driver_identity" : "RXAFFINITY_DRIVER_IDEN",
    "exphan_to_radctrl_identity" : "EXPHAN_RADCTRL_IDEN",
    "exphan_to_dsp_identity" : "EXPHAN_DSP_IDEN",
    "dsp_to_radctrl_identity" : "DSP_RADCTRL_IDEN",
    "dsp_to_driver_identity" : "DSP_DRIVER_IDEN",
    "dsp_to_exphan_identity" : "DSP_EXPHAN_IDEN",
    "dsp_to_dw_identity" : "DSP_DW_IDEN",
    "dspbegin_to_brian_identity" : "DSPBEGIN_BRIAN_IDEN",
    "dspend_to_brian_identity" : "DSPEND_BRIAN_IDEN",
    "dw_to_dsp_identity" : "DW_DSP_IDEN",
    "dw_to_radctrl_identity" : "DW_RADCTRL_IDEN",
    "dw_to_rt_identity" : "DW_RT_IDEN",
    "rt_to_dw_identity" : "RT_DW_IDEN",
    "brian_to_radctrl_identity" : "BRIAN_RADCTRL_IDEN",
    "brian_to_driver_identity" : "BRIAN_DRIVER_IDEN",
    "brian_to_dspbegin_identity" : "BRIAN_DSPBEGIN_IDEN",
    "brian_to_dspend_identity" : "BRIAN_DSPEND_IDEN",
    "ringbuffer_name": "data_ringbuffer",
    "ringbuffer_size_bytes" : "200e6",
    "data_directory" : "/home/alex/data/borealis_data",
    "log_directory" : "/home/alex/data/borealis_logs"
}

RemingtonRohel commented 1 year ago

The interferometer_antenna_count field is set to 0, but you have one specified in the n200s list. Change the value to 1 and it should be good!
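To make the mismatch concrete, here is a rough sketch of the kind of consistency check that would produce this error. It is not the Borealis validation code; it assumes, from this exchange only, that an entry counts as an interferometer antenna when its rx_int flag is true, and the function name `check_interferometer_count` is invented.

```python
def check_interferometer_count(config):
    """Return an error string if the declared count disagrees with n200s, else None.

    Assumption (not confirmed by the Borealis source): an N200 contributes
    an interferometer antenna when its "rx_int" flag is true.
    """
    declared = int(config["interferometer_antenna_count"])
    listed = [n["interferometer_antenna"] for n in config["n200s"] if n.get("rx_int")]
    if declared != len(listed):
        return (
            f"interferometer_antenna_count is {declared}, but {len(listed)} "
            f"interferometer antenna(s) are specified in n200s: {listed}"
        )
    return None
```

Loading the file with `json.load` and passing the resulting dict to this function before starting the radar would flag the config posted above, where the count is "0" but one N200 has rx_int set to true.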

alexchartier commented 1 year ago

but it's not an interferometer...

If I set interferometer_antenna_count=1, then I have to set interferometer_antenna = 1.

I have it set like this now (just trying to generate some pulses and t/r signals to test my amps):

"site_id" : "apl",
"gps_octoclock_addr" : "addr=192.168.10.3",
"device_options" : "recv_frame_size=4000",
"main_antenna_count" : "1",
"interferometer_antenna_count" : "0",
"n200s" : [ 
    {
        "addr" : "192.168.10.2",
        "rx" : false,
        "tx" : true,
        "rx_int" : false,
        "main_antenna" : "1",
        "interferometer_antenna" : "0" 
    }
],

alexchartier commented 1 year ago

We're back to the std::bad_alloc though

RemingtonRohel commented 1 year ago

Ah okay. I think you need to set it like this, and it will generate data files that you may or may not care about. I'll test in our lab to confirm though. The issue that I linked earlier explains that the std::bad_alloc occurs when one of the lists we generate (tx, rx, rx_int) is empty, that is, when no n200s have set that field to true. In this case, you only have 1 n200, so you need to set all 3 fields to true. Also, we index from zero, so I believe that the main_antenna and interferometer_antenna fields have to be set to 0 for your n200, and the count for each set to 1.

Sorry for all the hassle about this. We should update our documentation to be more clear in this regard.

"site_id" : "apl",
"gps_octoclock_addr" : "addr=192.168.10.3",
"device_options" : "recv_frame_size=4000",
"main_antenna_count" : "1",
"interferometer_antenna_count" : "1",
"n200s" : [ 
    {
        "addr" : "192.168.10.2",
        "rx" : true,
        "tx" : true,
        "rx_int" : true,
        "main_antenna" : "0",
        "interferometer_antenna" : "0" 
    }
], 
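The empty-list condition described above can be checked before launching. The following is a small sketch under the assumption stated in this comment (each of tx, rx and rx_int must be true on at least one N200 in the main branch); the function name `empty_roles` is hypothetical.

```python
def empty_roles(n200s, roles=("tx", "rx", "rx_int")):
    """Return the roles for which no N200 in the list has the flag set to true.

    A non-empty result corresponds to the situation described above as
    triggering std::bad_alloc in the driver.
    """
    return [role for role in roles if not any(n.get(role) for n in n200s)]
```

For the earlier config with rx and rx_int set to false this reports both of those roles, while the config shown here reports none.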
alexchartier commented 1 year ago

that seems to have worked - thanks a lot. What is the meaning of the "main_antenna" and "interferometer_antenna" in the "n200s" section? I thought this was a "main_antenna"...

RemingtonRohel commented 1 year ago

You're welcome!

main_antenna is which physical antenna the n200 is connected to in the main array, starting from 0 for the furthest left from boresight. interferometer_antenna is the same but for the interferometer array. Since there are multiple receive channels you could have a single n200 hooked up to both a main antenna and an interferometer antenna, which we have done at our sites. If your n200 isn't connected to an interferometer antenna then it's perfectly valid to have that field blank. The main branch currently has a bug where you must have at least one interferometer antenna specified among the n200s, but that will be fixed in the next release.
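As an illustration of that mapping, the sketch below builds antenna-index-to-N200 tables from an n200s list. It is a reading aid rather than Borealis code: the function name `antenna_map` is invented, and treating a blank or missing field as "not connected" follows the description above.

```python
def antenna_map(n200s):
    """Map antenna index -> N200 address for the main and interferometer arrays.

    Index 0 is the antenna furthest left from boresight. A blank or
    missing field means the N200 is not connected to that array.
    """
    main, interferometer = {}, {}
    for n200 in n200s:
        for field, table in (("main_antenna", main), ("interferometer_antenna", interferometer)):
            value = str(n200.get(field, "")).strip()
            if value:
                table[int(value)] = n200["addr"]
    return main, interferometer
```

For a hypothetical two-device setup where the first N200 serves main antenna 0 and interferometer antenna 0 and the second serves only main antenna 1, the main table has two entries and the interferometer table has one.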

If you have any suggestions for how we could make these details more clear, or any other improvements to the software, I'd really appreciate your feedback.

alexchartier commented 1 year ago

the radar has not quite started up actually - it seems to be waiting for something. The N200 driver goes through its stuff, and the experiment handler sends a new experiment from the beginning, but then nothing happens. Any ideas? Do I need to get the realtime scheduler part working properly?

RemingtonRohel commented 1 year ago

If the experiment handler is setting up a new experiment then it can't be a problem with the scheduler. The USRP driver window gets all the way to stating the time difference for each N200? I'm not sure off the top of my head what that might be. If you could zip up the logs and send them, I can dig through for clues.

RemingtonRohel commented 4 months ago

Not sure that there was anything actionable here. The v0.7 release handled a lot of the original issues - warning if cupy not installed, clarifying the N200 channel to antenna mapping in the config files, and fixing the problem with the radar hanging shortly after startup.