MGX3D / CurrentViewer

Interactive charting for LowPowerLab CurrentRanger: capture and share GIF animations with power state profiles/swings for IoT devices
MIT License

CV with Py3.8.6 #2

tcsaba101 opened this issue 3 years ago

tcsaba101 commented 3 years ago

Hi

After I installed Visual Studio 2017 (I think the C++ build tools are needed), it started and ran on Py3.9.

Capture02

But as soon as I touched any object in the graphics window, it got stuck.

After installing 3.8.6, the graphics window stays alive for 8-10 seconds, then stops showing the measurements, while the program keeps reading from serial and running in the command window.

Can you tell me how to proceed?

MGX3D commented 3 years ago

Can you share the current_viewer.log (if it's not too large, otherwise delete it and try to do a repro of the issue)?

From the description it almost sounds like the UI is getting stuck (I assume you did not press Pause?). It also looks like the sample speed is much lower than expected (110 vs 600-700), do you see any negative readings or erroneous floating point numbers coming from CR?

tcsaba101 commented 3 years ago

Hi,

These are the logs for 3.9 and 3.8.6. There is a bit of overlap in the files.

The 3.9 was continuously working and displaying the graph. All captures here were made with the Snipping Tool. As soon as I touched any button, it crashed.

The 3.8 stopped almost immediately after start (and of course Pause was not pressed).

Thanks, Csaba

current_viewer.log current_viewer_39.log

tcsaba101 commented 3 years ago

The Captures. Capture01 Capture02 Capture03

MGX3D commented 3 years ago

At first look it seems there are a number of USB data corruption errors: random characters interleaved with the sample data. Normally all the samples should come back in the format "1.234e-05\r\n". It would be a wild guess as to what's causing it; perhaps a bad USB cable or a grounding issue with the PC (are you running CR off LiPo?)
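A strict format check makes it easy to reject lines with interleaved garbage before attempting the float conversion. A minimal sketch (the regex and function names here are my own illustration, not CurrentViewer's actual code):

```python
import re

# CR streams one sample per line, e.g. "1.234e-05\r\n".
# A strict pattern rejects lines with interleaved garbage characters.
SAMPLE_RE = re.compile(r'^-?\d+(\.\d+)?e-?\d+$')

def parse_sample(line: bytes):
    """Return the sample in amps, or None if the line is corrupted."""
    text = line.decode('ascii', errors='replace').strip()
    if not SAMPLE_RE.match(text):
        return None
    return float(text)

print(parse_sample(b"1.234e-05\r\n"))      # a valid sample
print(parse_sample(b"43.9531.45e-6\r\n"))  # corrupted: two decimal points
```

A corrupted line like "43.9531.45e-6" fails the pattern (two decimal points) and is dropped rather than raising a ValueError mid-stream.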

image image

tcsaba101 commented 3 years ago

The corrupted data is coming from the CR: if I use PuTTY and log the stream, it appears randomly there too. My unit has been sending corrupted data since I started USB data collection. There are not many corrupted records; I made a quick stat, and in that case fewer than 10 out of 100k samples were corrupted. The appearance looks absolutely stochastic. It is a good feature of CV that it handles the filtering! I have purchased a new CR, not yet arrived; I will check with that.

The cabling looks good quality, but I will change it and see what happens!

The CR is on LiPo. I use an opto-isolator on the USB link, and also an opto on the DUT's terminal UART/USB converter, so a ground loop is prevented. I am well aware of ground loops, because they have driven me crazy many times.

Besides the graphical visualization, data logging (to a CSV file) would increase the value of CV. That would help with post-analysis of the CR measurements.

MGX3D commented 3 years ago

I'm working on adding CSV export (and perhaps JSON too :) ). There are a few improvements in progress with regard to speed on slower CPUs (matplotlib is CPU intensive for large data sets); those are a bit more involved, but I hope to have something out in the next couple of days. In the meantime, can you check if reducing the buffer size to 1000 (instead of the default 10,000) increases the sample speed and the UI responsiveness on your machine? The parameter is here: https://github.com/MGX3D/CurrentViewer/blob/eda24dad39fee8ae4d52198455bba561af4081f5/current_viewer.py#L261
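The chart buffer that parameter controls behaves like a bounded FIFO; in Python that is naturally a `collections.deque` with `maxlen`. A sketch of the idea (not the actual CurrentViewer code):

```python
from collections import deque

# A smaller maxlen means matplotlib redraws fewer points per frame,
# which is why shrinking it from 10000 to 1000 can restore UI speed.
buffer = deque(maxlen=1000)

for i in range(5000):
    buffer.append(i * 1e-6)  # old samples fall off the front automatically

print(len(buffer))   # capped at 1000
print(buffer[0])     # oldest retained sample
```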

I suspect the opto-isolator might also play a role in the SPS speed. I don't have one to test with, but I'm curious what your results are with and without the opto-isolator.

tcsaba101 commented 3 years ago

My logging PC is not a powerful machine: image

I decreased the buffer to 1000 and it works (the picture shows a wind speed sensor's power draw during a sleep period): image

Pause was working; I stopped and annotated the current.

:)

tcsaba101 commented 3 years ago

I tried csp=5000; it was dead. At csp=2500 it looks to work.

image

image

image

tcsaba101 commented 3 years ago

Regarding the opto-isolator and data errors: first I tried without the opto, with the PC separated from the system ground. The result was the same.

I am waiting for the new CR unit, which should arrive in the next few days; then I can check the difference.

I am really happy about the new logging-to-CSV feature. The graphics window is very nice if I need to show the details to others. But more important is analyzing what happens over 1, 2, 3... days of operation. I think logging is more important than instant visualization, and on a weak PC, by decreasing the csp, the logging can still work in the background.

About SPS: I changed the USB menu to "faster" sampling, but the SPS remained the same in the graphics window.

Of course, I have a more powerful PC: image

I can install CV on that one to check, if you think it can help. Py 3.8 or 3.9?

MGX3D commented 3 years ago

Check out the most recent version (1.0.1) - it adds CSV/JSON export, as well as many improvements for visualization speed. Let me know how this works for you :)

tcsaba101 commented 3 years ago

This is a perfect solution! THANKS! You addressed all the emerging requests!

It runs perfectly with the default parameters at a buffer of 100,000!

There is a logging conversion problem; see the files: current_viewer.log data2.txt

The options used: current_viewer.py -p COM18 -g -c -vvv. Is -vvv the option for the highest debug verbosity?

I have not checked the graphics functions yet.

MGX3D commented 3 years ago

Thanks for the nice words!

Errors like these are expected when the data comes in corrupted from the hardware/USB (there is nothing CurrentViewer can do to parse them, as these are not valid floating point numbers): ValueError("could not convert string to float: '43.9531.45e-6\r\n'")

CurrentViewer filters out the obviously invalid ones, but indeed, if the floating point number appears to be in a correct representation (as is the case with "2020-11-11 16:48:35.905941,40.6") it will still export it. Obviously this value is corrupted as well, just in a different way, as CurrentRanger cannot possibly measure 40 amps :).
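A second line of defense against this kind of corruption is a plausibility check on the magnitude: a well-formed float can still be rejected if it is outside what the hardware can physically measure. A sketch (the bound is illustrative, not CurrentViewer's actual limit):

```python
# Plausibility filter: CurrentRanger tops out far below 40 A, so even a
# syntactically valid float can be flagged as corrupted if out of range.
# MAX_PLAUSIBLE_AMPS is an assumed example value, not a real CR spec.
MAX_PLAUSIBLE_AMPS = 3.0

def is_plausible(amps: float) -> bool:
    """True if the reading is physically possible for the device."""
    return abs(amps) <= MAX_PLAUSIBLE_AMPS

print(is_plausible(4.06e-05))  # 40.6 uA: a normal low-power reading
print(is_plausible(40.6))      # the corrupted "40.6" A row from the CSV
```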

Do you have the latest firmware on the CR? (share the output of the '?' command) Because the data is both corrupted and also coming 6x (!) slower than I would expect.

tcsaba101 commented 3 years ago

I will analyze the source of the corrupted data once I have the newly purchased CR units in hand. The new one should work well!

The present CR unit has the latest firmware; it was flashed on Nov 2.

The speed will also be examined using a faster PC. As I see it, the speed is set by the CR and the PC's internal USB service communication. PuTTY and the Arduino Serial Monitor run at the same speed (110 SPS), even though a 115k baud rate has been set up!?

It will take some days to finish, because I have a 2-day program starting tomorrow.

I will report asap. Thanks!

tcsaba101 commented 3 years ago

Some experiences: 48 hours of logging produced a file over 530MB at 110 SPS. The number of records doesn't fit on an Excel sheet (1M lines); it is about 1.5M records. I think most of the data points can be omitted from the logfile without losing information about the relevant power consumption of the DUT.

Most of the arriving data is the low-power consumption, all around 40 uA (in my case), which is really not important to log sample by sample. We are only interested in the higher and lower samples. We could decrease the number of less important records with adaptive logging: range logging, where a sample is logged only when it is outside two defined thresholds; and differential logging, where a defined relative change (in %) of the sample decides whether to write it to the log or not. I once built this kind of logger to record the current flow of a car battery, where the starter motor needed fast logging but other activities did not.
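The two strategies described above can be combined into one pass over the sample stream. A sketch (thresholds and names are illustrative, not part of CurrentViewer):

```python
def adaptive_filter(samples, low=2e-5, high=6e-5, rel_change=0.15):
    """Yield only 'interesting' (timestamp, value) samples:
       - range logging: anything outside [low, high] is always kept;
       - differential logging: inside the range, keep a sample only if
         it changed by more than rel_change vs. the last kept sample.
       Thresholds here are example values for a ~40 uA sleep current."""
    last = None
    for t, value in samples:
        outside = not (low <= value <= high)
        changed = last is None or abs(value - last) > rel_change * abs(last)
        if outside or changed:
            yield (t, value)
            last = value

# Sleep current with one transmit spike: only 3 of 5 samples survive.
data = [(0, 4.0e-5), (1, 4.1e-5), (2, 4.0e-5), (3, 0.04), (4, 4.0e-5)]
print(list(adaptive_filter(data)))
```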

This kind of "data adaptive" logging would decrease IoT device power log file sizes by about 98%, I suppose.

The other small thing: if I stop the CV.py program, then restart it and forget to change the -o file.csv filename, it purges the previous log content. I would prefer the run to stop with an error message ("file exists", or something like that), or to just append a timestamp to the filename.
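Either behavior (abort, or derive a timestamped name) is a small wrapper around the output path. A sketch of the idea, assuming a hypothetical helper, not CurrentViewer's actual behavior:

```python
import os
import sys
import time

def safe_output_path(path: str, append_timestamp: bool = True) -> str:
    """If 'path' already exists, either abort or derive a timestamped
       name, so a forgotten -o argument never silently purges the
       previous log. (Illustrative sketch only.)"""
    if not os.path.exists(path):
        return path
    if not append_timestamp:
        sys.exit(f"error: output file '{path}' already exists")
    root, ext = os.path.splitext(path)
    stamp = time.strftime('%Y%m%d-%H%M%S')
    return f"{root}-{stamp}{ext}"
```

So a second run with `-o data1.csv` would write to something like `data1-20201124-203813.csv` instead of truncating the old file.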

tcsaba101 commented 3 years ago

I made a graph of 1M samples. There were many negative values in the file, and also unrealistic values, which I filtered out (it was very slow): Log_1M

Now my sampling interval is 5 min; normally sampling is every 15 or 30 min. Maybe the saving is not 98%, but 50% looks realistic.

tcsaba101 commented 3 years ago

A bit more testing: 50.5 hours of logging was perfectly stable. The file created is 703MB. I found a CSV splitter that chopped it into 20 files of 1M lines each: https://sourceforge.net/projects/splitcsv/files/split.exe/download

The analysis of the first 1M records:

image

There are 59 negative data points, and 79 are over 1A, which is unrealistic, as this DUT contains an RN2483 LoRa radio module (transmitting at about 40mA). Some sensors can go up to a 170mA peak, but only one at a time, so the overall consumption is surely under 1 amp. 91% is in the low-power range of 20-60 uA. If the logging were done with simple data filtering, the log could be shrunk by 85%. That way the 2-day log at 110 SPS could fit in 2 files (or 1 day's data in 1 file), and the singularities could be analyzed much more easily.

tcsaba101 commented 3 years ago

On the 3rd day it stops logging and continuously writes the same item to the current_viewer.log file. I have now seen this happen 3 times. I don't see any correlation in file sizes or record counts.

image

I was not able to catch the moment when the logging stops, to see the preceding circumstances. The 1M logfile size was not enough to capture it.

I have received the new CRs. One is installed; it established a 105 SPS link. The same errors appear in the data stream. I will continue testing.

MGX3D commented 3 years ago

I added a new option, --log-size, in 1.0.2 that lets you customize the maximum log size.

Increased log size for debugging

> python current_viewer.py -p COM9 -vvv --log-size 128

This increases the maximum log size to 128 megabytes from the default of 1 (alternatively, --log-size 0.1 would cap the log at ~100KB). It can be useful for capturing hard-to-reproduce errors. Note there are two logs on disk: the current one (current_viewer.log) and the rotated one (current_viewer.log.1), so the total log on disk is actually double this parameter.
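The behavior described (one current log plus one rotated .log.1 file, capped at a size) matches Python's standard `logging.handlers.RotatingFileHandler`. A sketch of how such an option might be wired up (the function name and defaults are assumptions, not the actual CurrentViewer source):

```python
import logging
from logging.handlers import RotatingFileHandler

def make_logger(log_size_mb: float = 1.0) -> logging.Logger:
    # --log-size N caps current_viewer.log at N megabytes; backupCount=1
    # keeps one rotated file (current_viewer.log.1), so disk usage is ~2x.
    handler = RotatingFileHandler(
        'current_viewer.log',
        maxBytes=int(log_size_mb * 1024 * 1024),
        backupCount=1,
    )
    # Format mirrors the log lines quoted later in this thread,
    # e.g. "ERROR:2020-11-24 20:38:13,448:Thread-1:..."
    handler.setFormatter(logging.Formatter(
        '%(levelname)s:%(asctime)s:%(threadName)s:%(message)s'))
    logger = logging.getLogger('current_viewer')
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    return logger
```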

As for the issue you're seeing, I'm beginning to suspect a problem on the PC side (either USB or USB drivers). Do you have another PC to test, or perhaps use a different USB port (check the device hierarchy in Device Manager / view by connection), or remove the USB hub if you have one. Here is how it looks for me (CR USB is on COM9 and BT is one of the 12-22 ports :) ): image

MGX3D commented 3 years ago

Check out the latest version; it has some fixes around data corruption and exception handling. Note that CV will now abort faster in some scenarios, such as a high data corruption rate or USB serial exceptions (this also keeps the logs from being rotated out and preserves the relevant errors from before, as was happening in your case).

I did a bit more testing, and error 22 above is typically what follows a device-unplug scenario. Hopefully the logs in the new version will contain the errors that occur before error 22 (there is still some logging from other things in Python, so you might still want to use the --log-size option I mentioned above).

A disconnect error will typically look like this in the log (and the chart title will update in red):

ERROR:2020-11-24 20:38:13,448:Thread-1:Serial read error: None: (<class 'serial.serialutil.SerialException'>, SerialException("ClearCommError failed (PermissionError(13, 'The device does not recognize the command.', None, 22))"), <traceback object at 0x0000024F0E333A00>)
INFO:2020-11-24 20:38:13,448:Thread-1:Telling CR to stop USB streaming
WARNING:2020-11-24 20:38:13,448:Thread-1:Was not able to clean disconnect from the device

tcsaba101 commented 3 years ago

Installing on my fastest PC, here is where I am. As I understand it, the install is fine:

image

But starting the *.py, there is some glitch:

image

What to do?

MGX3D commented 3 years ago

Check out this solution (and thread): https://developercommunity.visualstudio.com/solutions/1257144/view.html

Also make sure to update CV to 1.0.6; it should fix the random data corruption issues (it's a bug in Python's libraries as well: readline() does not appear to be thread-safe). I guess the remaining issue will be speed; I'm curious how this turns out after trying the steps above.

tcsaba101 commented 3 years ago

Thanks it worked.

Now I am struggling with the USB driver, on the new PC install and also on the old Win10 install that worked until now. I tried to install the Arduino Zero driver (https://github.com/arduino/ArduinoCore-samd/tree/master/drivers), but with no success:

image

image

image

I also have the same issue on my Win7 install.

Do you know where I can find a Win10/Win7 driver for the SAMD21?

MGX3D commented 3 years ago

Not sure what devices you have there, as VID_0000 PID_0002 seems like an invalid hardware ID. Besides: 1) Windows 10 has a generic USB serial driver built in; it should work without installing any drivers (and in Device Manager the driver will say usbser.inf). 2) CR for me shows up as HW ID USB\VID_04D8&PID_EE4C&REV_0100&MI_00 (and when in flashing mode it is USB\VID_04D8&PID_EE44&REV_4201&MI_00).

Perhaps delete all the strange USB drivers and start fresh (from Device Manager, click Uninstall and check "delete the driver,,"): image Then unplug it, plug it back in, and make sure it gets an inbox or Windows Update driver.

tcsaba101 commented 3 years ago

Finally, I have gotten over all the USB issues, on Win10 and Win7 too.

I am running on my DELL G3 the CV:

current1

It sometimes peaks at over 1000 SPS. But then it stops; the Python program dies. current2

I am copying some files for you to find the source of the issue: current_viewer1.log Shell Window_Content.txt

Then, starting with the options -p COM18 -g -c -vvv -o data1.csv --log-size 128, I got a logging speed of less than 100 SPS.

I have changed to -vv: image

I got a steady 1300+ SPS. Wow :-)) I'll leave it running until the morning; we will see the results.

MGX3D commented 3 years ago

Nice, congrats!

The log is completely corrupted, but the shell window content shows you're running CV 1.0.5. Upgrade to 1.0.6 or newer, which has a new serial read method that appears to work much better against these random data corruptions.

tcsaba101 commented 3 years ago

Congrats to you! 1.0.7 works perfectly. current4

The 1.0.5 version produced a 1GB file in 7 hours at 1300 SPS with no errors. That is about 33M records. I don't know what kind of app could analyze that; Excel can handle 1M records.

CV samples/logs according to the CR ADC speed setting (low, avg, fast) at 230 SPS, 650 SPS, and 1300 SPS.

As I discussed earlier, for long-duration low-power current logging, most of the uninteresting data samples could be filtered out of the logfile: image

The low-power state current data could be averaged and logged as two datapoints in the file: a first and a last point.

image

The annotated section of the incoming data could be represented in the data log file with the first measurement entering the low-power range (e.g. 30-60 uA) as the first point; then, when the current goes out of the LP range, the last logged LP data point could be the average of the full annotated period. Or both points could be the average of the LP time segment. This way the logfile could keep a limited size with only the relevant data.

Could you insert an option to handle this request?
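The request above amounts to run-length compressing the low-power segments: each run inside the LP band collapses to an entry point plus one closing point carrying the run's average. A sketch under those assumptions (thresholds are the 30-60 uA example range; names are mine):

```python
def compress_lp(samples, low=3e-5, high=6e-5):
    """Collapse each run of low-power samples (within [low, high]) into
       two records: the entry sample and a closing point holding the
       run's average. Samples outside the band pass through unchanged."""
    out, run = [], []

    def flush():
        if run:
            avg = sum(v for _, v in run) / len(run)
            out.append(run[0])             # first point: entry sample
            out.append((run[-1][0], avg))  # last point: run average
            run.clear()

    for t, v in samples:
        if low <= v <= high:
            run.append((t, v))
        else:
            flush()
            out.append((t, v))
    flush()
    return out
```

Applied to a stream that is 91% sleep current, this keeps every spike intact while shrinking each long sleep plateau to two rows.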

I will test later with the slower PC the 1.0.7. version.

tcsaba101 commented 3 years ago

1.0.7 runs fine on the old PC, but at 110 SPS and with a lot of conversion errors at runtime. You can see the errors in the shell display: SHELL_content.txt There are no errors in the log files, probably because of the -vv option instead of -vvv?

The GUI works with the default options, and saving is also fine: image

I used the same USB hardware (cable, isolator) as with the fast PC running at 1300 SPS. I tried several USB ports on the PC. Is it the USB hardware or the CPU of the PC that makes the difference?

But if I connect through a self-powered USB2 hub to the same USB port, the speed goes up to 730 SPS!?

image

It looks like, if the speed is low, more USB hardware setups should be evaluated.

By the way, for dumb (not smart-filtered) file logging, even 110 SPS is quite a high rate. The 5-day log contains 44M records, which would be 44 Excel sheets to analyze; that is impossible.

Is it possible to set the CV sampling rate to a desired value? As far as I know, not in CR and not in CV. There is a sampling parameter in CR (fast, avg, slow), but the actual sampling speed is determined by the USB and PC configuration, as I see it. A lower sampling speed could reduce the log record count to a manageable size.

I will also test the app on a PC that is between the two in performance.

MGX3D commented 3 years ago

It's possible, but it would be hard to believe the CPU is the limiting factor here, unless it's much slower than a Raspberry Pi 4. For reference, a Raspberry Pi 400 does in excess of 2000 SPS (with firmware mods to send more data). What kind of CPU are we talking about here? (CV doesn't log the CPU speed yet.)

Can you run this benchmark from a command prompt: "type COM3 >test.raw", press Ctrl-C after 30s, then divide the number of rows you got in test.raw by the number of seconds it ran?
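The same measurement can be scripted in Python; counting newline-terminated samples over a fixed window gives the effective SPS independently of CV's chart overhead. A sketch (the COM port name is an example; any file-like object with readline() works):

```python
import time

def measure_sps(stream, duration=30.0):
    """Count newline-terminated samples arriving on 'stream' for
       'duration' seconds and return the samples-per-second rate."""
    count = 0
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        line = stream.readline()
        if not line:
            break  # EOF, or a read timeout with nothing received
        count += 1
    return count / duration

# Example with pyserial (assuming it is installed and CR is on COM3):
#   import serial
#   with serial.Serial('COM3', timeout=1) as port:
#       print(f"{measure_sps(port, 30):.0f} SPS")
```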

The errors are always logged; -vvv just increases the verbosity for less severe messages. But yes, you're right, those Python exceptions can really slow things down. It seems there are still some data corruptions (e.g. a missing newline), and I suspect they're coming from the python/serial libraries again; it seems I haven't flushed out all of them.

I hear you on the need to reduce the data acquisition volume (which is funny, considering I was trying to collect as much data as possible up until now :) ). I'll think of a way to add smart filtering in the next CV version, but I'm also in talks with Felix, as I believe this kind of feature would ideally live on the CurrentRanger: it's best not to send redundant/repeating data in the first place, as the USB-CDC speed is the limiting factor right now anyway. Plus it would help other apps as well (e.g. the Arduino serial monitor).

I think the filtering method requires a bit more research with real-world data patterns (e.g., are the regular dots in your chart actual spikes in consumption, or noise/interference?), so as not to filter out useful data.

tcsaba101 commented 3 years ago

This CPU: image It is fine for Arduino development.

The command is not working: image

Since I changed to the new CR unit, which I purchased from Welectron in Germany last month, things are going more smoothly. I am using an external self-powered D-Link USB hub, plus the USB isolator. I have upgraded the new CR to the 1.1.0 firmware; running in fast sampling mode, it is better than the old CR with the same firmware. There are no conversion errors anymore!

image

The data collection speed should be settable for different applications. High speed is needed to analyze fast signals or fine details in the time domain. For debugging an IoT device's peripherals, a 1-2 kSPS logging speed is usually quite enough, and CR and CV are fine for this kind of project. But when I need MCU-level fast (and short) logging, I use the Saleae Pro, which can log at 500 MSPS: https://www.saleae.com/home2/

Low-speed logging of IoT devices is for checking operating conditions over long periods in the field. Environmental IoT sensors are active for a few hundred milliseconds at a time, so 10-100 SPS should be enough to detect an issue if one exists.

You are absolutely right that it should be filtered in the CR, according to some additional settings. That way it does not send unnecessary data at all.

I have some ideas for filtering methods. I once did something similar for a car battery with a -500 A / 20 mA / 30 A range (starter / off / charging). I used the method of not logging a new record unless the next sample changed by more than +-15%; then the next dozen samples were recorded regardless of their value. This is also a viable method for IoT devices, to filter out the sleep current consumption records in, e.g., the 20-60 uA range.
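The car-battery method described above can be sketched directly: a sample is dropped unless it differs from the last logged one by more than +-15%, and once a change triggers, a burst of following samples is recorded unconditionally. A minimal illustration (names and the burst length of 12 are taken from the description; not CurrentViewer code):

```python
def hysteresis_filter(samples, rel=0.15, burst=12):
    """Drop a sample unless it differs from the last logged one by more
       than rel (+-15% by default); after a triggering change, record
       the next 'burst' samples regardless of value."""
    out, last, countdown = [], None, 0
    for t, v in samples:
        triggered = last is None or abs(v - last) > rel * abs(last)
        if triggered or countdown > 0:
            out.append((t, v))
            last = v
            countdown = burst if triggered else countdown - 1
        # samples inside the +-15% dead band (and past the burst) are dropped
    return out
```

For a sleep current sitting at ~40 uA, the long flat stretches vanish from the log while every transmit spike (and its surrounding context) is kept.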