Closed by telegraphic 1 year ago
Added hit_summary method to HitBrowser: a new method that provides a summary of the hits in the browser.
Fixed bug in get_obs() method: keys not in the schema were being loaded, causing a KeyError.
Updated DB schema with 'signal extent' column: the database schema now includes a new 'signal extent' column, and the hitsearch function has been updated to populate it.
Improved find_et pipeline for large datasets on GPUs: the find_et pipeline now allows overlapping gulps when reading data from DataArray, which makes polynomial fitting + edge blanking possible on gulps < coarse channel size. Especially useful for Parkes UWL data (128 MHz coarse channels!)
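For illustration only, the overlapping-gulp idea can be sketched as a generator of channel ranges. The function name `overlapping_gulps` and its parameters are hypothetical and are not taken from the hyperseti API; the sketch only shows how consecutive gulps can share `overlap` channels so that blanked edges in one gulp are covered by its neighbour.

```python
def overlapping_gulps(n_chan, gulp_size, overlap):
    """Yield (start, stop) channel ranges that overlap by `overlap` channels.

    Hypothetical helper for illustration; not part of hyperseti.
    """
    if not 0 <= overlap < gulp_size:
        raise ValueError("overlap must satisfy 0 <= overlap < gulp_size")
    if n_chan <= 0:
        return
    step = gulp_size - overlap  # how far each new gulp advances
    start = 0
    while True:
        stop = min(start + gulp_size, n_chan)  # last gulp may be shorter
        yield start, stop
        if stop >= n_chan:
            break
        start += step


if __name__ == "__main__":
    # Ten channels read in gulps of four, with one channel shared
    for start, stop in overlapping_gulps(10, 4, 1):
        print(start, stop)
```

Hits found in the shared channels would show up in both gulps, so a real pipeline would need a de-duplication step afterwards; that step is not shown here.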
Merging #109 (c20e05f) into master (bd0454e) will decrease coverage by 0.52%. The diff coverage is 95.54%.
:exclamation: Current head c20e05f differs from pull request most recent head 9ab2645. Consider uploading reports for the commit 9ab2645 to get more accurate results.
@@ Coverage Diff @@
## master #109 +/- ##
==========================================
- Coverage 94.41% 93.89% -0.52%
==========================================
Files 30 33 +3
Lines 1790 1983 +193
==========================================
+ Hits 1690 1862 +172
- Misses 100 121 +21
Impacted Files | Coverage Δ | |
---|---|---|
hyperseti/test_data/__init__.py | 86.95% <ø> (ø) | |
hyperseti/io/hit_db.py | 90.74% <50.00%> (-2.53%) | :arrow_down: |
hyperseti/kernels/dedoppler.py | 89.18% <88.57%> (-10.82%) | :arrow_down: |
hyperseti/kernels/kernel_manager.py | 94.28% <93.93%> (ø) | |
hyperseti/kernels/peak_finder.py | 95.60% <96.55%> (-0.34%) | :arrow_down: |
hyperseti/kernels/blank_hits.py | 97.05% <96.87%> (-2.95%) | :arrow_down: |
hyperseti/blanking.py | 92.42% <100.00%> (+0.23%) | :arrow_up: |
hyperseti/data_array.py | 95.10% <100.00%> (ø) | |
hyperseti/dedoppler.py | 87.31% <100.00%> (-6.89%) | :arrow_down: |
hyperseti/dimension_scale.py | 97.00% <100.00%> (ø) | |
... and 12 more | | |
From smear_corr_kernel:
Moved dedoppler function to a new class: the dedoppler function is now part of the DedopplerMan class.
Added a base class for kernel management: a new file, kernel_manager.py, provides the base class for the kernel managers in hyperseti/kernels/.
Switched to PeakFinderMan in peak_finder: peak_finder now uses the PeakFinderMan class instead of calling peak_find directly, which tidies the overall code design.
Modified blanking hits implementation: the blanking hits function now has its own loop over time samples within each beam rather than calling blank hit directly, for more flexibility.
Inheritance changes in PeakFinder: the PeakFinder class now inherits from KernelManager.
Introduced SmearCorrMan class for smearing correction: a new SmearCorrMan class handles kernel management for the smearing correction.
Optimized GPU memory allocation and deallocation: GPU memory is now managed within the workspace dictionary of each object, and a __del__ method has been added to release memory cleanly after use.
Removed redundant code: some __init__() methods and redundant code have been removed from the peak_finder and smear_corr kernels, as they are now handled by their parent classes.
Added test for smear_corr kernel: a new test checks that the smear_corr kernel works as expected.
Fixed frequency axis bug in dedoppler: resolved an issue with the frequency axis when using a custom plan (e.g., optimal) in dedoppler.
Added a new kernel manager for handling hits: a new kernel manager has been implemented to improve hit handling in the dedoppler and peak finder kernels.
Modified dedoppler and peak finder kernels to use the new kernel manager: both kernels now go through the newly created kernel manager.
Updated tests to verify kernel manager functionality: test_dedoppler, test_peak_kernel, and test_smear_corr now check that the kernel managers work as expected.
Fixed bug in hitsearch when no peaks are above the threshold value: resolved a bug where hitsearch was not returning hits when there were no peaks above the threshold value.
Improved plotting code readability: the plotting code has been reworked to be easier to read, but further improvements are still needed.
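A minimal sketch of the kernel-manager pattern described in the commits above, assuming a base class that owns a workspace dictionary and frees it on deletion. The class names `KernelManagerSketch` and `PeakFinderSketch`, and the `_allocate`, `free` and `execute` methods, are hypothetical and are not the actual hyperseti classes. NumPy arrays stand in for GPU buffers so the example runs without a GPU.

```python
import numpy as np


class KernelManagerSketch:
    """Hypothetical base class: owns named buffers in a workspace dict."""

    def __init__(self, name):
        self.name = name
        self.workspace = {}  # buffer name -> array (GPU array in practice)

    def _allocate(self, key, shape, dtype=np.float32):
        """Reuse an existing buffer if shape and dtype match, else reallocate."""
        shape = tuple(shape)
        buf = self.workspace.get(key)
        if buf is None or buf.shape != shape or buf.dtype != np.dtype(dtype):
            self.workspace[key] = np.empty(shape, dtype=dtype)
        return self.workspace[key]

    def free(self):
        """Drop all references so the memory can be released."""
        self.workspace.clear()

    def __del__(self):
        # Guard against __init__ having failed before workspace was set
        if getattr(self, "workspace", None) is not None:
            self.free()


class PeakFinderSketch(KernelManagerSketch):
    """Hypothetical child class: only the kernel-specific step lives here."""

    def __init__(self):
        super().__init__("peak_finder")

    def execute(self, data):
        # Maximum over the first axis, written into a reusable buffer
        out = self._allocate("max_vals", (data.shape[1],), dtype=data.dtype)
        np.max(data, axis=0, out=out)
        return out


if __name__ == "__main__":
    pf = PeakFinderSketch()
    print(pf.execute(np.array([[1, 5], [3, 2]], dtype=np.float32)))
```

Keeping the buffers in one dictionary means repeated calls with the same input shape reuse memory instead of reallocating, and the child classes need no memory-handling code of their own.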
Stats from performance test on Parkes UWL data:
Takeaway: most time taken in merging hits, in a Pandas inbuilt query()
-- this is already a C-based routine in Pandas so hard to speed up.
Could try a FoF algorithm instead of pd.query? Maybe DBSCAN?
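As a rough idea of what a FoF pass could look like, here is a minimal sketch that labels hits as friends when they are close in both channel and drift rate. Which columns define proximity, the linking lengths, and the function name `fof_labels` are all assumptions for illustration; none of this is the existing merge code.

```python
import numpy as np


def fof_labels(chan, drift, link_chan, link_drift):
    """Friends-of-friends group label for each hit (hypothetical sketch).

    Two hits are friends if they differ by <= link_chan in channel and
    <= link_drift in drift rate; groups are chains of friends.
    """
    chan = np.asarray(chan, dtype=float)
    drift = np.asarray(drift, dtype=float)
    n = len(chan)
    parent = np.arange(n)  # union-find forest, one tree per group

    def find(i):
        # Walk up to the root, halving the path as we go
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    order = np.argsort(chan, kind="stable")
    for a in range(n):
        i = order[a]
        for b in range(a + 1, n):
            j = order[b]
            if chan[j] - chan[i] > link_chan:
                break  # sorted by channel: no later hit can be a friend
            if abs(drift[j] - drift[i]) <= link_drift:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[max(ri, rj)] = min(ri, rj)

    roots = np.array([find(i) for i in range(n)], dtype=int)
    # Relabel the roots as consecutive integers 0..n_groups-1
    _, labels = np.unique(roots, return_inverse=True)
    return labels


if __name__ == "__main__":
    chan = [100, 101, 103, 500, 501, 102]
    drift = [0.1, 0.1, 0.2, 0.1, 3.0, -5.0]
    print(fof_labels(chan, drift, link_chan=2, link_drift=0.5))
```

The labels could then be used to keep the strongest hit per group. The inner loops are pure Python, so whether this beats the query()-based merge would need profiling on real data.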
Loading data from disk is currently taking 20% of time. Pipelining with bifrost ('hyperfrost') is not going to cut down as much time as improving hitsearch speed.
The get_signal_extent code in hitsearch is Python-only, but is not showing up as a bottleneck.
A potential Python bottleneck is sorting into groups in PeakFinderMan.hitsearch:
# Final stage: we need to make sure only one maxima within
# The minimum spacing. We loop through and assign to groups
# Then find the maximum for each group.
# TODO: Speed up this code
df = np.column_stack((np.arange(len(hits)), hits, idx_f, idx_t))
## Sort into groups
groups = []
cur = df[0]
g = [cur, ]
for row in df[1:]:
    if row[2] - cur[2] < min_spacing:
        g.append(row)
    else:
        groups.append(g)
        g = [row, ]
    cur = row
groups.append(g)
df = []
for g in groups:
    if len(g) == 1:
        df.append(g[0])
    else:
        mv, mi = g[0][1], 0
        for i, h in enumerate(g[1:]):
            if mv < h[1]:
                mv, mi = h[1], i + 1
        df.append(g[mi])
df = np.array(df)
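One possible way to address the TODO is to replace both Python loops with NumPy operations. The sketch below assumes each row is compared with the row immediately before it (so a group is a chain of hits whose consecutive frequency gaps are below min_spacing), that the hits contain no NaNs, and that ties keep the first maximum, as the strict `<` in the loop does. The function name `select_group_maxima` is hypothetical.

```python
import numpy as np


def select_group_maxima(hits, idx_f, idx_t, min_spacing):
    """Keep the strongest hit in each run of closely spaced hits.

    Returns rows of (original index, hit value, idx_f, idx_t).
    """
    hits = np.asarray(hits)
    if len(hits) == 0:
        return np.empty((0, 4))
    df = np.column_stack((np.arange(len(hits)), hits, idx_f, idx_t))

    # A new group starts wherever the gap to the previous row is too large
    new_group = np.diff(df[:, 2]) >= min_spacing
    starts = np.concatenate(([0], np.flatnonzero(new_group) + 1))
    group_id = np.concatenate(([0], np.cumsum(new_group)))

    # Maximum hit value within each group
    group_max = np.maximum.reduceat(df[:, 1], starts)

    # Rows that equal their group's maximum; keep the first one per group
    idx = np.flatnonzero(df[:, 1] == group_max[group_id])
    _, first = np.unique(group_id[idx], return_index=True)
    return df[idx[first]]


if __name__ == "__main__":
    out = select_group_maxima(
        hits=[1, 5, 3, 2, 9, 4],
        idx_f=[0, 2, 4, 20, 22, 50],
        idx_t=[0, 0, 0, 1, 1, 2],
        min_spacing=5,
    )
    print(out)
```

The empty-input guard also avoids the `df[0]` lookup failing when there are no peaks above threshold.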
Coming back from eight weeks of parental leave, and can't remember exactly where I was up to, so doing a YOLO merge and will see what breaks
WTD Summary - 25 July 2023
What The Diff was unable to process this PR.