rapidsai / clx

A collection of RAPIDS examples for security analysts, data scientists, and engineers to quickly get started applying RAPIDS and GPU acceleration to real-world cybersecurity use cases.
Apache License 2.0
168 stars 68 forks

[REVIEW] Update notebooks to use the new Tokenizer #430

Closed VibhuJawa closed 3 years ago

VibhuJawa commented 3 years ago

This PR closes https://github.com/rapidsai/clx/issues/418.

Files to change:

VibhuJawa commented 3 years ago

The CI failure looks unrelated to this PR. In a fresh environment, the test below fails:

(rapids) root@37bcf6ec0c27:/clx/python/clx# pytest tests/test_windows_event_parser.py::test_windows_event_parser
==================================================================================================== test session starts ====================================================================================================
platform linux -- Python 3.7.10, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: /clx/python, configfile: pytest.ini
plugins: anyio-3.2.0
collected 1 item                                                                                                                                                                                                            
.....
.....
.....
.....
.....
.....
.....

>       assert parsed_rec["detailed_authentication_information_logon_process"] == "ntlmssp"
E       AssertionError: assert 'ntlmssp ' == 'ntlmssp'
E         - ntlmssp
E         + ntlmssp 
E         ?        +

tests/test_windows_event_parser.py:136: AssertionError

efajardo-nv commented 3 years ago

The errors from windows_event_parser were caused by upstream changes to cuDF strings extract. They should be fixed with this PR.
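
For anyone hitting the same assertion (`'ntlmssp ' == 'ntlmssp'`), here is a minimal sketch of how a capture group can pick up trailing whitespace. It uses Python's standard `re` module rather than cuDF, and both the sample log fragment and the patterns are made up for illustration; they are not the ones in windows_event_parser, and this is not necessarily how the fix was made.

```python
import re

# Hypothetical log fragment: note the space before the newline.
raw = "Logon Process: NtLmSsp \nAuthentication Package: NTLM"
record = raw.lower()

# A greedy "rest of the line" capture keeps the trailing space,
# because "." matches everything except the newline.
loose = re.search(r"logon process:\s+(.*)", record).group(1)

# Option 1: strip whitespace from the extracted value.
stripped = loose.strip()

# Option 2: tighten the pattern so it only captures non-space characters.
tight = re.search(r"logon process:\s+(\S+)", record).group(1)

print(repr(loose), repr(stripped), repr(tight))
```

Either stripping the extracted column (in cuDF, presumably via `Series.str.strip()`) or tightening the capture group should make the comparison in the test pass; which of the two is appropriate depends on whether the field can legitimately contain spaces.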

VibhuJawa commented 3 years ago

Thanks for linking that, @efajardo-nv. The PR is ready for review then. :-D

VibhuJawa commented 3 years ago

@gpucibot rerun tests

efajardo-nv commented 3 years ago

rerun tests

efajardo-nv commented 3 years ago

@gpucibot merge