Closed Mohkam closed 3 years ago
Can you post your sqlxewriter.toml (and please clean up any passwords)?
Can you also post the result of this query:
SELECT @@SERVERNAME AS [ServerName]
,COALESCE(DEFAULT_DOMAIN(), '') AS [DomainName]
,COALESCE(CAST(SERVERPROPERTY('MachineName') as nvarchar(128)), @@SERVERNAME) AS [Computer]
,CAST(COALESCE(SERVERPROPERTY('ProductLevel'), '') as nvarchar(128)) AS ProductLevel
,COALESCE(CAST(SERVERPROPERTY('ProductMajorVersion') as NVARCHAR(128)) + '.' + CAST(SERVERPROPERTY('ProductMinorVersion') as NVARCHAR(128)),'') AS ProductRelease
,COALESCE(CAST(SERVERPROPERTY('ProductVersion') AS NVARCHAR(128)), '') as [ProductVersion];
Ah, I see the issue. Give me just a minute. I don't have a great Linux test environment.
Thank you so much! It returns nothing for the domain.
My TOML:
[elastic]
addresses = ["http://X:X:x:x:9200/"]
username = "xxxx"
password = "xxxxxxx"
auto_create_indexes = true
default_index = "dev-sql-extended-events"
[filesink]
retain_hours = 4000
[app]
http_metrics = true
http_metrics_port = 6061
watch_config = true
# adds = [ "global.host:$(HOST)", # $(VAR) must be upper case. the resulting value is lower case.
# "global.app:$(EXE)"
# ]
# copies = [ "@timestamp:newtimestamp" ]
# moves = [ ]
[defaults]
sessions = ["system_health", "AlwaysOn_health"]
look_back = "1680000h" # note: 168h is one week; this value is far longer
rows = 20000 # Process batches of 20,000 at a time; a good compromise. 0 or omitted processes all the rows
timestamp_field_name = "@timestamp"
payload_field_name = "mssql" # all the XE events are under this field
agentjobs = "all" # (all|failed|none) - process SQL Server agent job history
excludedEvents = [
"connectivity_ring_buffer_recorded",
"memory_broker_ring_buffer_recorded",
"scheduler_monitor_system_health_ring_buffer_recorded",
"security_error_ring_buffer_recorded"
]
[[source]]
fqdn = "x:x:x:x"
user = "sa"
password = "xxxxxx"
I just pushed a fix. It looks like you are building locally? If not, I can also try to deploy a new binary -- but that may be a little later.
I am building locally; I will share the results shortly.
I am sorry if I am requesting too much. I am facing the same error, but on the second and subsequent instances. I believe the reason is that the SQL Server name (as on Windows deployments) is truncated to 15 characters (the NetBIOS name limit), even on Linux :( . Would it be possible to add an optional tie-breaker to the TOML config and make it part of the duplicate-detection logic, something like:
[[source]]
fqdn = "x:x:x:x"
UniqueID = "201"
so its unique identity in the duplicate-detection logic would be CurrentID (ServerName, perhaps) + UniqueID.
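The truncation described above can be confirmed directly on each instance. This is a plain T-SQL check (not specific to this tool) comparing what SQL Server reports against the 15-character NetBIOS limit:

```sql
-- If NameLength is exactly 15 on a host with a longer hostname,
-- the name has almost certainly been truncated at the NetBIOS limit.
SELECT @@SERVERNAME                                          AS ReportedName,
       LEN(@@SERVERNAME)                                     AS NameLength,
       CAST(SERVERPROPERTY('MachineName') AS nvarchar(128))  AS MachineName;
```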
My TOML (pasting only the delta relative to the previous message):
[[source]]
fqdn = "x:x:x:201"
user = "sa"
password = "xxxxxx"
[[source]]
fqdn = "x:x:x:202"
user = "sa"
password = "xxxxxx"
[[source]]
fqdn = "x:x:x:203"
user = "sa"
password = "xxxxxx"
LOG:
TRAC[15552] app.program.startpolling...
DEBU[15552] desired config file: D:\GitHub\xelogstash\deploy\windows\sqlxewriter\sqlxewriter.toml
DEBU[15552] checking sources file: D:\GitHub\xelogstash\deploy\windows\sqlxewriter\sqlxewriter_sources.toml
INFO[15552] config file: D:\GitHub\xelogstash\deploy\windows\sqlxewriter\sqlxewriter.toml
INFO[15552] default poll interval: 60s
INFO[15552] pprof available at http://localhost:6061/debug/pprof
INFO[15552] expvars available at http://localhost:6061/debug/vars
INFO[15552] metrics available at http://localhost:6061/debug/metrics
INFO[15552] sources: 4; default rows: 100
INFO[15553] sink: files: D:\GitHub\xelogstash\deploy\windows\sqlxewriter\events\sqlevents_YYYYMMDD_HH.events (keep: 240h0m0s)
INFO[15553] sink: elastic: http://10.10.10.207:9200
TRAC[15554] loop: true
TRAC[15554] poll routine launched: windows-service source=10.10.10.201
INFO[15554] 10.10.10.201: polling interval: 60s source=10.10.10.201
TRAC[15554] poll routine launched: windows-service source=10.10.10.203
TRAC[15554] poll routine launched: windows-service source=10.10.10.204
TRAC[15554] poll routine launched: windows-service source=10.10.10.202
TRAC[15560] source: 10.10.10.201; polling (#1)... source=10.10.10.201
TRAC[15560] entering ProcessSource source=10.10.10.201
TRAC[15560] source: 10.10.10.201; sessions: 3; events after: 2019-05-30T01:10:32+05:00 source=10.10.10.201
DEBU[15560] user: sa source=10.10.10.201
TRAC[15566] tpsj-dev-sj-mss: sys.messages for login_failed: 37 instance=tpsj-dev-sj-mss source=10.10.10.201
TRAC[15566] 10.10.10.201: starting: session: system_health instance=tpsj-dev-sj-mss session=system_health source=10.10.10.201
INFO[15568] [0] Source: tpsj-dev-sj-mss (system_health); 'Start At' skipped at least one event
DEBU[15568] tpsj-dev-sj-mss () session: system_health; events: 4; per_second: 1 duration_ms=2241 events=4 events_per_sec=1 instance=tpsj-dev-sj-mss session=system_health source=10.10.10.201
TRAC[15568] 10.10.10.201: starting: session: activity_tracking instance=tpsj-dev-sj-mss session=activity_tracking source=10.10.10.201
INFO[15569] 10.10.10.202: polling interval: 60s source=10.10.10.202
TRAC[15569] 10.10.10.201: starting: session: telemetry_xevents instance=tpsj-dev-sj-mss session=telemetry_xevents source=10.10.10.201
ERRO[15569] source: 10.10.10.201 () - XE - telemetry_xevents : validatesession: no file target instance=tpsj-dev-sj-mss session=telemetry_xevents source=10.10.10.201
DEBU[15571] tpsj-dev-sj-mss () session: agent_jobs; events: 0 duration_ms=2511 events=0 events_per_sec=0 instance=tpsj-dev-sj-mss session=agent_jobs source=10.10.10.201
ERRO[15571] instance: tpsj-dev-sj-mss; err: errors occurred - see previous source=10.10.10.201
ERRO[15575] skipping duplicate: fqdn: '10.10.10.202'; domain: ''; server: 'tpsj-dev-sj-mss': duplicate domain-instance
INFO[15584] 10.10.10.203: polling interval: 60s source=10.10.10.203
ERRO[15590] skipping duplicate: fqdn: '10.10.10.203'; domain: ''; server: 'tpsj-dev-sj-mss': duplicate domain-instance
INFO[15599] 10.10.10.204: polling interval: 60s source=10.10.10.204
ERRO[15606] skipping duplicate: fqdn: '10.10.10.204'; domain: ''; server: 'tpsj-dev-sj-mss': duplicate domain-instance
INFO[15614] metrics: alloc: 5.0mb; sys: 24.7mb; goroutines: 21; uptime: 4h20m; sources: 4
Details that are specific to my case (I am not sure if I should mention them):
Well that is annoying. But good job figuring out what it is doing 👍
I'm not sure it's that simple. I use the results of @@SERVERNAME
to populate quite a few fields in there. And without that, it will be hard to separate these boxes. Let's see what data we do have. Can you run the following two queries on one or two of the servers and post the results?
SELECT COALESCE(serverproperty('InstanceName'), '<NULL>') AS InstanceName
, COALESCE(serverproperty('MachineName') , '<NULL>') AS MachineName
, COALESCE(serverproperty('ServerName') , '<NULL>') AS ServerName
select [name] from sys.servers where server_id = 0
Please excuse my ignorance of the matter. Here are screenshots of two of them, although all four of my test environment instances show the same.
Your screenshots were perfect. I just uploaded new code that supports the following. For each source, you can define two additional fields:

- server_name_override allows you to override @@SERVERNAME and SERVERPROPERTY('MachineName') returned by a source. This is useful for Linux servers inside containers that set long machine names, where SQL Server only returns the first 15 characters. This setting supports names longer than 15 characters.
- domain_name_override allows you to override DEFAULT_DOMAIN() returned by a source. This is useful for Linux servers that don't technically belong to a domain, so they can appear to be in one.

I don't think you need to set the domain name override, but the server name override will allow you to set the value that SQL Server returns as the server name. That should give you enough flexibility for almost any situation. Let me know if that works.
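Putting the two fields together, a [[source]] entry might look like the sketch below. The field names come from the description above; the values (and the override names chosen) are placeholders, not values from this thread:

```toml
[[source]]
fqdn = "x:x:x:201"
user = "sa"
password = "xxxxxx"
# Replace the 15-character truncated name SQL Server reports (placeholder value)
server_name_override = "long-container-hostname-201"
# Optional: give a domain-less Linux server an apparent domain (placeholder value)
domain_name_override = "dev.local"
```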
Thank you so much! server_name_override has helped me out. Now that I have it running, I will go through the extracted events.
Great! I'm glad that worked for you.
When trying to get extended events from an Ubuntu-based SQL Server in Kubernetes, I get the error: invalid value domain...
Here is my sample SQL pod deployment.
Could you please advise whether there is any way to override the domain name, since SQL Server is running on Linux? Any help on how I can use this for Linux-based SQL servers would be appreciated.
Thank you so much for this amazing effort.