datastax / diagnostic-collection

Diagnostic Collector for Apache Cassandra
Apache License 2.0
28 stars 35 forks source link

Finding *-Statistics.db in multiple 'data_file_directories' locations fails #121

Closed joelsdc closed 2 years ago

joelsdc commented 2 years ago

ds-collect latest (v2.0.2) / Ubuntu 16

when running the collector I’ve noticed in the logs the following error:

    executing `find /storage/cassandra/data,/storage2/cassandra/data,/storage3/cassandra/data -maxdepth 3 -name *-Statistics.db -exec cp --parents {} /tmp/datastax/20db1.lax1.gogii.net_artifacts_2022_04_21_1232_1650569548/sstable-statistics/ ; > `… find: ‘/storage/cassandra/data,/storage2/cassandra/data,/storage3/cassandra/data’: No such file or directory
failed

It seems like an find is trying to search in the one unique path called /storage/cassandra/data,/storage2/cassandra/data,/storage3/cassandra/data instead of 3 different paths.

I tried manually searching for Statistic files and we have plenty:

root@20db1:~# find /storage/cassandra/data -maxdepth 3 -name *-Statistics.db | wc -l
2400
root@20db1:~# find /storage2/cassandra/data -maxdepth 3 -name *-Statistics.db | wc -l
2266
root@20db1:~# find /storage3/cassandra/data -maxdepth 3 -name *-Statistics.db | wc -l
2401
root@20db1:~#

If I run the find the same way the collector does, it fails:

root@20db1:~# find /storage/cassandra/data,/storage2/cassandra/data,/storage3/cassandra/data -maxdepth 3 -name *-Statistics.db
find: ‘/storage/cassandra/data,/storage2/cassandra/data,/storage3/cassandra/data’: No such file or directory
root@20db1:~#

I believe the issue is with the find cmd concatenating multiple search paths, you must do that without concatenating:

root@20db1:~# find /storage/cassandra/data /storage2/cassandra/data /storage3/cassandra/data -maxdepth 3 -name *-Statistics.db | wc -l
7063
root@20db1:~#

I've narrowed down the issue to:

https://github.com/datastax/diagnostic-collection/blob/fe13ab0ba54269d2975e2d089e2251c6a70afee4/ds-collector/ds-collector#L556

The fix for that specific line to have a result of 3 different paths instead of one for $cassandra_data_dir would be to replace: tr $'\n' ',' with tr $'\n' ' '

    cassandra_data_dir=$(sed -n '/^data_file_directories:/,/^[^- ]/{//!p;};/^data_file_directories:/d' "$configHome/cassandra.yaml" | grep -e "^[ ]*-" | sed -e "s/^.*- *//" | tr $'\n' ' ' | sed -e "s/.$/\n/")

The problem with that fix is that for sure it's going to break things further down, as later some other parsing does expect the ',' there to continue working correctly... so changes are required in multiple places or add new code to address this.

Thanks! Joel.

michaelsembwever commented 2 years ago

duplicate of https://github.com/datastax/diagnostic-collection/issues/100

though the analysis and write up here is super valuable, thanks @joelsdc !

this presents itself a super simple fix here: https://github.com/datastax/diagnostic-collection/blob/master/ds-collector/ds-collector#L477

  export data_dir=${cassandra_data_dir/,/ }

this is because the script can be left to work with the value as comma-separated, while then the rust command can now work with it as space-separated.

joelsdc commented 2 years ago

Awesome! Running it again with that fix!

Do you want me to open a PR or you got it?

Thanks @michaelsembwever !!

michaelsembwever commented 2 years ago

Yes, (thanks!), I'd definitely rather see a commit with your name on it 😎

joelsdc commented 2 years ago

I tried with your suggestion and we almost had it:

    executing `find /storage/cassandra/data /storage2/cassandra/data,/storage3/cassandra/data -maxdepth 3 -name *-Statistics.db -exec cp --parents {} /tmp/datastax/node14.example.com_artifacts_2022_04_22_143
0_1650663034/sstable-statistics/ ; > `… find: ‘/storage2/cassandra/data,/storage3/cassandra/data’: No such file or directory
failed

I think the good-good one will be:

  export data_dir=${cassandra_data_dir//,/ }

(Notice the double // after the variable name to enforce "replace-all" instead of "replace-first-occurrence")

I'll test tonight and if it works as expected I'll submit a PR.

👍

michaelsembwever commented 2 years ago

(Notice the double // after the variable name to enforce "replace-all" instead of "replace-first-occurrence")

ooh! TIL