Open lindonm opened 1 month ago
@lindonm unfortunately I don't have Splunk Cloud, but testing with the versions of the other apps you listed, I cannot recreate the issue so far. I am wondering whether there are any strange characters in the events that it may be choking on. Does this only occur with sourcetype="o365:management:activity"? Can you try a search like this, which uses the built-in lookups to produce more than 10,000 results?
| inputlookup moby_dick.csv
| append
[ inputlookup peter_pan.csv]
| cleantext textfield=sentence keep_orig=true base_word=true remove_stopwords=false force_nltk_tokenize=true base_type="lemma_pos" term_min_len=1 ngram_mix=false
Thanks @geekusa ,
Ran that query and it succeeds with no errors in the UI
This search has completed and has returned 12,750 results by scanning 0 events in 25.786 seconds
The following messages were returned by the search subsystem:
info : [subsearch]: Successfully read lookup file '/opt/splunk/etc/apps/nlp-text-analytics/lookups/peter_pan.csv'.
I did note some of the same/similar errors in the search log, so maybe those are a red herring.
10-31-2024 23:21:59.314 INFO SearchParser [3300457 searchOrchestrator] - PARSING: | inputlookup moby_dick.csv\n| append \n [ inputlookup peter_pan.csv]\n| cleantext textfield=sentence keep_orig=true base_word=true remove_stopwords=false force_nltk_tokenize=true base_type="lemma_pos" term_min_len=1 ngram_mix=false
10-31-2024 23:21:59.318 INFO ServerConfig [3300457 searchOrchestrator] - Will add app jailing prefix /opt/splunk/bin/nsjail-wrapper for nlp-text-analytics
10-31-2024 23:21:59.318 INFO ChunkedExternProcessor [3300457 searchOrchestrator] - Running process: /opt/splunk/bin/nsjail-wrapper /opt/splunk/bin/python3.7m /opt/splunk/etc/apps/nlp-text-analytics/bin/cleantext.py
10-31-2024 23:21:59.382 ERROR ChunkedExternProcessor [3300462 ChunkedExternProcessorStderrLogger] - stderr: Failed to run splunk as SPLUNK_OS_USER. This command can only be run by bootstart user.
10-31-2024 23:21:59.382 ERROR ChunkedExternProcessor [3300462 ChunkedExternProcessorStderrLogger] - stderr: /opt/splunk/etc/apps/Splunk_SA_Scientific_Python_linux_x86_64/bin/linux_x86_64/bin/python: line 5: [: ==: unary operator expected
10-31-2024 23:22:00.690 INFO SearchParser [3300457 searchOrchestrator] - PARSING: inputlookup peter_pan.csv
10-31-2024 23:22:00.690 INFO AstOptimizer [3300457 searchOrchestrator] - SrchOptMetrics optimize_toJson=1.373341992
10-31-2024 23:22:00.690 INFO SearchParser [3300457 searchOrchestrator] - PARSING: | inputlookup "moby_dick.csv" | append [| inputlookup "peter_pan.csv"] | cleantext textfield=sentence keep_orig=true base_word=true remove_stopwords=false force_nltk_tokenize=true base_type="lemma_pos" term_min_len=1 ngram_mix=false
10-31-2024 23:22:00.690 INFO SearchParser [3300457 searchOrchestrator] - PARSING: | inputlookup "moby_dick.csv" | append [| inputlookup "peter_pan.csv"] | cleantext textfield=sentence keep_orig=true base_word=true remove_stopwords=false force_nltk_tokenize=true base_type="lemma_pos" term_min_len=1 ngram_mix=false
10-31-2024 23:22:00.690 INFO ServerConfig [3300457 searchOrchestrator] - Will add app jailing prefix /opt/splunk/bin/nsjail-wrapper for nlp-text-analytics
10-31-2024 23:22:00.690 INFO ChunkedExternProcessor [3300457 searchOrchestrator] - Running process: /opt/splunk/bin/nsjail-wrapper /opt/splunk/bin/python3.7m /opt/splunk/etc/apps/nlp-text-analytics/bin/cleantext.py
10-31-2024 23:22:00.746 ERROR ChunkedExternProcessor [3300527 ChunkedExternProcessorStderrLogger] - stderr: Failed to run splunk as SPLUNK_OS_USER. This command can only be run by bootstart user.
10-31-2024 23:22:00.746 ERROR ChunkedExternProcessor [3300527 ChunkedExternProcessorStderrLogger] - stderr: /opt/splunk/etc/apps/Splunk_SA_Scientific_Python_linux_x86_64/bin/linux_x86_64/bin/python: line 5: [: ==: unary operator expected
10-31-2024 23:22:01.406 INFO SearchParser [3300457 searchOrchestrator] - PARSING: | inputlookup "peter_pan.csv"
10-31-2024 23:22:01.410 INFO AstOptimizer [3300457 searchOrchestrator] - SrchOptMetrics optimize_toJson=0.717763582
Further to this, I experimented to determine how much impact the actual source data has:
Works:
| makeresults
| eval textcheck="My Text Here"
| fields textcheck
| cleantext textfield=textcheck keep_orig=true base_word=true remove_stopwords=false force_nltk_tokenize=true base_type="lemma_pos" term_min_len=1 ngram_mix=false
| append
[search sourcetype="o365:management:activity" (Operation="New-InboxRule" OR Operation="Set-InboxRule")
| head 1]
Works:
| makeresults
| eval textcheck="My Text Here"
| fields textcheck
| cleantext textfield=textcheck keep_orig=true base_word=true remove_stopwords=false force_nltk_tokenize=true base_type="lemma_pos" term_min_len=1 ngram_mix=false
| append
[search sourcetype="o365:management:activity" (Operation="New-InboxRule" OR Operation="Set-InboxRule")
]
Hi @geekusa, further to this, we are working with Splunk Support.
It currently looks like it might be an issue with the recent update to the NLP app version. We are getting their cloud support team to roll back the version so we can compare.
Following are notes from our support case:
You would need to follow that up the NLP App Developer. - Let me summarise:
1. Everything working fine with NLP App version: 1.1.4
2. You upgraded some Apps and Add-ons including the NLP # 4066 to 1.2.0
- After this searches with the " cleantext " command started to error.
3. We had a look at the remote search.logs from the Indexers and saw:
11-06-2024 22:53:21.720 ERROR ChunkedExternProcessor [3490370 ChunkedExternProcessorStderrLogger] - stderr: File "/opt/splunk/var/run/searchpeers/sh-i-<stack>.splunkcloud.com-1730933406/apps/nlp-text-analytics/bin/exec_anaconda.py", line 184, in get_system_paths
11-06-2024 22:53:21.720 ERROR ChunkedExternProcessor [3490370 ChunkedExternProcessorStderrLogger] - stderr: raise Exception(f'Unsupported platform: {system}')
11-06-2024 22:53:21.720 ERROR ChunkedExternProcessor [3490370 ChunkedExternProcessorStderrLogger] - stderr: Exception: Unsupported platform: ('Linux', 'aarch64')
4. I then double confirmed the processors that your Indexers have:
# idx-i-<stack>:~$ uname -a
Linux idx-i-0c97b29a8614e3b06 5.15.0-1038-aws #43~20.04.1-Ubuntu SMP Fri Jun 2 17:11:42 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
5. The only thing I can think of, is that there maybe something in the new exec_anaconda.py script ( this script, to do the Platform check ) that wasn't in the previous version,
and it gets tripped up on the fact that the CPUs or Architecture is Graviton ( == a subset of ARM or aarch64 )
https://aws.amazon.com/ec2/graviton/
Interesting find @lindonm. The script exec_anaconda.py comes almost line for line from the Python for Scientific Computing (for Linux 64-bit) app. At line 25 there is a section listing the supported systems (https://github.com/geekusa/nlp-text-analytics/blob/52930fc8b94b1406f55c9db9cc311fb1ba9f6f17/bin/exec_anaconda.py#L25). The Splunk app I mentioned only lists linux_x86_64 (I added the darwin_arm64 line, which matches the SUPPORTED_SYSTEMS in the Splunk Machine Learning Toolkit app). Going by their examples, Python for Scientific Computing doesn't appear to support ARM processors on Linux (but does on Mac). If you still have a support case open, perhaps you can ask how either of those 2 supported Splunk apps handles the same problem (the Python app has a similar exec_anaconda.py, and MLTK has util/base_util with a SUPPORTED_SYSTEMS constant), assuming you have one or both of these installed and can confirm they work. If we can confirm that it works, then I can just add whatever line they provide.
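For readers following along, here is a minimal sketch of the kind of lookup being discussed. It is not the app's actual code: the table simply mirrors the SUPPORTED_SYSTEMS constant quoted later in this thread, and the helper names detect_system and psc_suffix are hypothetical. It shows why a host that reports ('Linux', 'aarch64') ends up with the "Unsupported platform" exception seen in the search log.

```python
import platform

# Assumption: mirrors the SUPPORTED_SYSTEMS table quoted later in this thread.
SUPPORTED_SYSTEMS = {
    ("Linux", "x86_64"): "linux_x86_64",
    ("Darwin", "x86_64"): "darwin_x86_64",
    ("Darwin", "arm64"): "darwin_arm64",
    ("Windows", "AMD64"): "windows_x86_64",
}


def detect_system():
    """Return the (OS, architecture) tuple the lookup is keyed on."""
    return (platform.system(), platform.machine())


def psc_suffix(system, supported=SUPPORTED_SYSTEMS):
    """Map a system tuple to a PSC directory suffix, or raise if it is unknown."""
    if system not in supported:
        raise Exception(f"Unsupported platform: {system}")
    return supported[system]


if __name__ == "__main__":
    # Prints whatever this particular host reports.
    print("This host reports:", detect_system())
```

Running the script on any box with Python 3 prints the tuple that host would be looked up under, which is a quick way to compare a dev box against what the indexers report.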
Many thanks @geekusa ,
Splunk Cloud support tried to roll back the version, but
The manual App roll-back to version 1.1.4 got rejected over night, because the "Cloud Compatible" flag has been removed from that version !
- I'm not sure how I over-looked that, but would explain the: {"code":"400-bad-request","message":"App is not self-service installable
You could ask the developer to re-attach the Cloud Compatible flag to that version, - unless he had some other reasoning for taking it off.
( Maybe older, vulnerable Python etc. )
Would it be possible to re-add this flag?
Otherwise, I've been looking into Splunk_SA_Scientific_Python and exec_anaconda.py. I think (I'm definitely not a developer, just a dabbler) that the issue might be:
1) Splunk_SA_Scientific_Python/exec_anaconda.py has not been updated in 5 years.
1.a) This script is never actually run directly. Other apps like yours load functions from Splunk_SA_Scientific_Python, but that specific Python code is never actually called (meaning it may have been invalid for Splunk Cloud for a long time without anyone noticing).
From your comments:
After executing this function, you can safely import the Python
libraries included in Splunk_SA_Scientific_Python (e.g. numpy).
I'm still trying to figure it out, but it looks to me like your version of exec_anaconda.py is only 3 months old. Was it never used before that?
Would it be possible for you to add another supported platform to your code?
stderr: Exception: Unsupported platform: ('Linux', 'aarch64')
I'd do a pull request myself but not entirely sure how to :)
Splunk controls that flag. I believe it was removed because that version lost cloud compatibility, and there is no option for me to add it back.
I believe exec_anaconda.py is just an example script, but something like MLTK definitely uses it (or really a variation of it). Its purpose is to let you use compiled libraries that are not included in normal Python. In this case, the newer version of the nltk library requires a compiled regex library on the host, and Splunk_SA_Scientific_Python together with the exec_anaconda script makes that possible.
I've got to say I'm very confused that Splunk Cloud is even running on an ARM architecture; everything I can find says that isn't supported other than for universal forwarders. Can you verify whether MLTK works in your environment? That would probably show whether it is possible or not.
There are only 4 versions of Splunk_SA_Scientific_Python (1 for Linux, 1 for Windows, and 2 for Mac), so if I were to add a line for the aarch64 platform, my only option would be to point it at 'linux_x86_64' as the value. This is why I am asking whether MLTK works, because otherwise that mapping just doesn't seem like it would work.
Thanks - I have tested browsing to the Splunk_SA_Scientific_Python app and that currently fails (it tries to load an Inputs page, but nothing renders). The logs show an error similar to the first one I logged this issue with (unary operator expected).
To be honest though, I've never tried to load that app before, so I don't know if it even should work like that. I have just tested adding it on a local dev box (Linux x64) and I got the status/inputs page for a completely different app (Website input), so I have no idea what's going on there.
I've had a browse around the MLTK app in Splunk Cloud, and everything there appears to work fine, no errors, can do the showcase items etc.
Our Splunk Cloud has been running on ARM (AWS Graviton) for almost 2 years now. ARM isn't officially supported for Splunk Enterprise, but it is for Cloud. (Possibly only the Victoria experience, I'm not sure on that.)
I've just installed MLTK on my dev box, and I can see the following in exec_anaconda.py:
from util.base_util import (
get_apps_path,
get_mltk_pycache_path,
get_system_paths,
SUPPORTED_SYSTEMS,
PSC_PATH_PREFIX,
)
util/base_util.py:
SUPPORTED_SYSTEMS = {
('Linux', 'x86_64'): 'linux_x86_64',
('Darwin', 'x86_64'): 'darwin_x86_64',
('Darwin', 'arm64'): 'darwin_arm64',
('Windows', 'AMD64'): 'windows_x86_64',
}
PSC_PATH_PREFIX = 'Splunk_SA_Scientific_Python_'
# originally moved from exec_anaconda.py
So there is no mention of ARM there either, but potentially there is a "cloud only" version of MLTK? However, there is nothing in that exec_anaconda.py that raises an exception.
Your exec script seems to be a combination of the exec and base utils from MLTK.
The MLTK base_util only has a single exception for an unsupported system, and that is in get_system_paths. There is no exception in its exec_anaconda.py at all.
def get_system_paths():
    if platform.system() == "Darwin" and "ARM64" in platform.version():
        system = (platform.system(), "arm64")
    else:
        system = (platform.system(), platform.machine())
    if system not in SUPPORTED_SYSTEMS:
        raise Exception(f'Unsupported platform: {system}')
    sa_scipy = f"{PSC_PATH_PREFIX}{SUPPORTED_SYSTEMS[system]}"
(Note that it does contain a special case that reports the machine as arm64. Maybe this is a better option for aarch64?)
Whereas your exec_anaconda.py has 2 exception calls:
# Line 73
sa_path, system = get_system_paths()
if system not in SUPPORTED_SYSTEMS:
    raise Exception('Unsupported platform: %s %s' % (system))
sa_scipy = '%s%s' % (PSC_PATH_PREFIX, SUPPORTED_SYSTEMS[system])
# Line 178
def get_system_paths():
    if platform.system() == "Darwin" and "ARM64" in platform.version():
        system = (platform.system(), "arm64")
    else:
        system = (platform.system(), platform.machine())
    if system not in SUPPORTED_SYSTEMS:
        raise Exception(f'Unsupported platform: {system}')
All great finds, as I am flying blind on this one without being able to recreate your environment. I removed the lines that appeared to be redundant from get_system_paths(), so this should look more like the way MLTK handles it. I uploaded version 1.2.1 to Splunkbase. It is visible, but it is not currently the default because it has to pass checks, which in the past have taken a while. Are you able to test with this new version, and if so, can you report back the results?
Many thanks - Splunk Support have already requested to expedite the cloud vetting for v1.2.1 - I will let you know the outcome :)
Thanks @geekusa but unfortunately no luck.
Splunk Cloud have completed vetting of your app and we have installed v1.2.1, but still seeing the same error in search logs.
We're reaching out to Splunk Support to check if the back end logs show anything else or new.
Hi,
Unfortunately we're still seeing the same error;
11-15-2024 06:08:03.160 ERROR ChunkedExternProcessor [78900 ChunkedExternProcessorStderrLogger] - stderr: File "/opt/splunk/var/run/searchpeers/[sh-<stack>.splunkcloud.com]-1731650694/apps/nlp-text-analytics/bin/exec_anaconda.py", line 177, in get_system_paths
11-15-2024 06:08:03.160 ERROR ChunkedExternProcessor [78900 ChunkedExternProcessorStderrLogger] - stderr: raise Exception(f'Unsupported platform: {system}')
11-15-2024 06:08:03.160 ERROR ChunkedExternProcessor [78900 ChunkedExternProcessorStderrLogger] - stderr: Exception: Unsupported platform: ('Linux', 'aarch64')
11-15-2024 06:08:03.168 ERROR ChunkedExternProcessor [78837 RunDispatch] - EOF while attempting to read transport header read_size=0
Would you be willing to try either; 1) Add ('Linux', 'aarch64') into SUPPORTED_SYSTEMS or 2) Remove the exception entirely?
At least that way we'd be able to see if it does actually run.
Hi @lindonm, I can't remove the exception because that code sets the system_path for the correct version of Python for Scientific Computing. So before I add the supported system to the list, can you verify which version of Python for Scientific Computing is installed? I want to know that there will actually be a /opt/splunk/etc/apps/Splunk_SA_Scientific_Python_linux_x86_64 path to go to.
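On a host where you do have shell access, one way to answer the "will the path exist" question is to list the Python for Scientific Computing directories that are actually present. This is only a sketch: find_psc_apps is a hypothetical helper, and the apps path in the usage note is taken from the log lines earlier in this thread, so it may differ on other installs.

```python
import os

# Assumption: same directory prefix as the PSC_PATH_PREFIX constant quoted in this thread.
PSC_PATH_PREFIX = "Splunk_SA_Scientific_Python_"


def find_psc_apps(apps_path):
    """Return the sorted names of PSC app directories found under apps_path."""
    return sorted(
        name
        for name in os.listdir(apps_path)
        if name.startswith(PSC_PATH_PREFIX)
        and os.path.isdir(os.path.join(apps_path, name))
    )
```

For example, find_psc_apps("/opt/splunk/etc/apps") would return a list containing Splunk_SA_Scientific_Python_linux_x86_64 if the Linux build is installed there. This does not help on Splunk Cloud itself, where the filesystem is not reachable.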
Thanks @geekusa ,
Currently installed version is 4.2.1, we can update to 4.2.2 if needed (according to release notes, 4.2.2 just deals with OpenSSL patch). https://splunkbase.splunk.com/app/2882
Looking at the ML Toolkit exec_anaconda.py, it imports from util/base_util, which yours doesn't seem to.
from util.base_util import (
get_apps_path,
get_mltk_pycache_path,
get_system_paths,
SUPPORTED_SYSTEMS,
PSC_PATH_PREFIX,
)
And the only mention of SUPPORTED_SYSTEMS in that file is line 86
sa_path, system = get_system_paths()
system_path = os.path.join(sa_path, 'bin', '%s' % (SUPPORTED_SYSTEMS[system]))
In util/base_util.py it has
SUPPORTED_SYSTEMS = {
('Linux', 'x86_64'): 'linux_x86_64',
('Darwin', 'x86_64'): 'darwin_x86_64',
('Darwin', 'arm64'): 'darwin_arm64',
('Windows', 'AMD64'): 'windows_x86_64',
}
PSC_PATH_PREFIX = 'Splunk_SA_Scientific_Python_'
and
def get_system_paths():
    if platform.system() == "Darwin" and "ARM64" in platform.version():
        system = (platform.system(), "arm64")
    else:
        system = (platform.system(), platform.machine())
    if system not in SUPPORTED_SYSTEMS:
        raise Exception(f'Unsupported platform: {system}')
    sa_scipy = f"{PSC_PATH_PREFIX}{SUPPORTED_SYSTEMS[system]}"
    sa_path = os.path.join(get_apps_path(), sa_scipy)
    if not os.path.isdir(sa_path):
        raise Exception(f'Failed to find Python for Scientific Computing Add-on ({sa_scipy})')
    return sa_path, system
I don't really understand why this would be different, as that now seems to be the same as yours.
I can't get to the filesystem of our cloud environment, but the URL is
https://
So that would mean that
sa_scipy = f"{PSC_PATH_PREFIX}{SUPPORTED_SYSTEMS[system]}"
must evaluate to Splunk_SA_Scientific_Python_linux_x86_64. But the exception states ('Linux', 'aarch64'), so I'm really at a loss there, as NLP had been working just fine until very recently (when we updated a number of apps at the same time).
My only assumption is that MLTK isn't actually using PSC (or maybe only uses it in some circumstances that we haven't hit yet).
There's no specific PSC app for ARM other than "Mac Apple Silicon". I just downloaded that to my local box, extracted it, and can confirm its path is Splunk_SA_Scientific_Python_darwin_arm64.
To add to the confusion, I just checked in our cloud app management and all 4 versions show up with an "Install" button
If we were to override supported systems with this;
SUPPORTED_SYSTEMS = {
('Linux', 'x86_64'): 'linux_x86_64',
('Linux', 'aarch64'): 'linux_x86_64',
('Darwin', 'x86_64'): 'darwin_x86_64',
('Darwin', 'arm64'): 'darwin_arm64',
('Windows', 'AMD64'): 'windows_x86_64',
}
Then if the path doesn't exist there is still a second check:
if not os.path.isdir(sa_path):
    raise Exception(f'Failed to find Python for Scientific Computing Add-on ({sa_scipy})')
So would that be enough?
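Putting the two checks together, the proposal can be sketched roughly as follows. This is an illustration under assumptions, not the code that shipped in any version of the app: resolve_psc_path is a hypothetical name, and the apps path is passed in as a parameter so the logic can be tried outside Splunk.

```python
import os

PSC_PATH_PREFIX = "Splunk_SA_Scientific_Python_"

# Proposed table from this thread: aarch64 Linux reuses the linux_x86_64 directory.
SUPPORTED_SYSTEMS = {
    ("Linux", "x86_64"): "linux_x86_64",
    ("Linux", "aarch64"): "linux_x86_64",
    ("Darwin", "x86_64"): "darwin_x86_64",
    ("Darwin", "arm64"): "darwin_arm64",
    ("Windows", "AMD64"): "windows_x86_64",
}


def resolve_psc_path(system, apps_path, supported=SUPPORTED_SYSTEMS):
    """Return the PSC app directory for a system tuple, applying both checks."""
    # First check: the (OS, architecture) tuple must be in the table.
    if system not in supported:
        raise Exception(f"Unsupported platform: {system}")
    sa_scipy = f"{PSC_PATH_PREFIX}{supported[system]}"
    sa_path = os.path.join(apps_path, sa_scipy)
    # Second check: the mapped directory must actually exist on disk.
    if not os.path.isdir(sa_path):
        raise Exception(
            f"Failed to find Python for Scientific Computing Add-on ({sa_scipy})"
        )
    return sa_path
```

Note that passing both checks only shows that the directory is present. It does not by itself show that the binaries inside that directory will run on the host.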
I'll follow up with Splunk Support to confirm what actual version of PSC we should be using.
I think the URL from your environment is enough of a clue that the path will exist (thanks for checking that). I agree with your last comment and have added the line for the second supported Linux system (aarch64) pointing to the linux_x86_64 path. It is in version 1.2.2 and like before, awaiting approval.
Thanks.
Thanks @geekusa, I'll let you know how we go :)
Hi @geekusa Splunk Cloud is now updated to v1.2.2, but we are still seeing the same error in the UI. I'm reaching out to Splunk Support to check if it's still the same back end error.
This is potentially related to a recent update to Splunk_SA_Scientific_Python_linux_x86_64. We are attempting to downgrade that app, but as we are on Splunk Cloud and the app is >500 MB, we are unable to do so ourselves and are waiting on the support team.
The following search fails with an error:
Error log details
If, however, I run this search, it runs as expected with no errors:
This search also works, simply by limiting the results:
I have experimented with multiple result limits, from "| head 1" to "|head 10000". They all work, but as soon as I remove the head command it fails. Note that in my selected time period there are only 24 entries, so even "|head 1000" works fine, yet removing it causes the search to fail with the error.
Splunk Cloud Version: 9.2.2406.107 (Victoria)
nlp-text-analytics v1.2.0
Splunk_SA_Scientific_Python_linux_x86_64 v4.2.1
Splunk_ML_Toolkit v5.4.2