vzaigrin / UniversalStorageCollector

Tool for gathering performance information from different storage systems

Vplex parsing error #11

Open ollivers opened 6 years ago

ollivers commented 6 years ago

https://vzaigrin.wordpress.com/2016/07/21/vplexcollector-tool-for-gathering-emc-vplex/

Nearly all metrics are collected, but sometimes (every two or three hours) we encounter the following "unparseable date" error:

Nov 13 05:44:09 mon-oed-115-0 java[18976]: [ERROR] [11/13/2018 05:44:09.253] [USC-akka.actor.default-dispatcher-13] [akka://USC/user/cluster-2.vplex] Unparseable date: "Time"
Nov 13 05:44:09 mon-oed-115-0 java[18976]: java.text.ParseException: Unparseable date: "Time"
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at java.text.DateFormat.parse(DateFormat.java:366)
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at universalstoragecollector.VPLEX$$anonfun$ask$1.apply(VPLEX.scala:75)
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at universalstoragecollector.VPLEX$$anonfun$ask$1.apply(VPLEX.scala:61)
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at scala.collection.immutable.List.foreach(List.scala:381)
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at universalstoragecollector.VPLEX.ask(VPLEX.scala:61)
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at universalstoragecollector.Storage$$anonfun$receive$1.applyOrElse(Storage.scala:10)
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at akka.actor.Actor$class.aroundReceive(Actor.scala:484)
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at universalstoragecollector.Storage.aroundReceive(Storage.scala:5)
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at akka.actor.ActorCell.invoke(ActorCell.scala:495)
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at akka.dispatch.Mailbox.run(Mailbox.scala:224)
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
Nov 13 05:44:09 mon-oed-115-0 java[18976]: at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Example monitor:

VPlexcli:/> ll /monitoring/directors/director-2-3-A/monitors/director-2-3-A_director-2-3-A_fe-lu/

/monitoring/directors/director-2-3-A/monitors/director-2-3-A_director-2-3-A_fe-lu:

Attributes:
Name             Value
average-period   5min
bucket-count     64
bucket-max       -
bucket-min       -
bucket-width     -
collecting-data  true
firmware-id      7
idle-for         1.77min
ownership        true
period           5min
statistics       [fe-lu.ops, fe-lu.read, fe-lu.read-avg-lat, fe-lu.write, fe-lu.write-avg-lat]
targets          device_Symm0522_00C0_1_vol, device_Symm0522_00C1_1_vol, device_Symm0522_00C5_1_vol, device_Symm0522_1480_1_vol, device_Symm0522_1481_1_vol, device_Symm0522_1482_1_vol, device_Symm0522_1483_1_vol, device_Symm0522_1484_1_vol, device_Symm0522_1485_1_vol, device_Symm0522_1486_1_vol, ... (1999 total)
version          -

Contexts:
Name   Description
sinks  Contains all of the sinks set up to collect data from this performance monitor.

VPlexcli:/> ll /monitoring/directors/director-2-3-A/monitors/director-2-3-A_director-2-3-A_fe-lu/sinks/

/monitoring/directors/director-2-3-A/monitors/director-2-3-A_director-2-3-A_fe-lu/sinks:
Name  Enabled  Format  Sink-To
file  true     csv     /var/log/VPlex/cli/reports/director-2-3-A_fe-lu.csv

Best regards, Oliver

Originally posted by @ollivers in https://github.com/vzaigrin/UniversalStorageCollector/issues/1#issuecomment-438158102

vzaigrin commented 6 years ago

Hello.

It looks like the header line of the CSV file. You can ignore it.

Best regards, Vadim Zaigrin

ollivers commented 6 years ago

Hello Vadim,

Sure, it's the header line (the first line) of the monitor sink CSV file, but AFAIK it might be essential, as this header contains all the metric names that usc gathers and drops into carbon, e.g.:

Time,fe-director.read-lat recent-average (us),fe-director.write-lat recent-average (us),director.fe-ops-q (counts),director.fe-read (KB/s),director.be-write (KB/s),director.fe-ops-write (counts/s),director.be-ops (counts/s),director.fe-write (KB/s),director.be-read (KB/s),director.be-ops-read (counts/s),director.fe-ops-read (counts/s),director.be-ops-write (counts/s),director.busy (%),director.fe-ops (counts/s),director.fe-ops-act (counts),director.per-cpu-busy CPU11 (%),director.per-cpu-busy CPU22 (%),director.per-cpu-busy CPU12 (%),director.per-cpu-busy CPU23 (%),director.per-cpu-busy CPU13 (%),director.per-cpu-busy CPU14 (%),director.per-cpu-busy CPU15 (%),director.per-cpu-busy CPU16 (%),director.per-cpu-busy CPU17 (%),director.per-cpu-busy CPU18 (%),director.per-cpu-busy CPU19 (%),director.per-cpu-busy CPU0 (%),director.per-cpu-busy CPU1 (%),director.per-cpu-busy CPU2 (%),director.per-cpu-busy CPU3 (%),director.per-cpu-busy CPU4 (%),director.per-cpu-busy CPU5 (%),director.per-cpu-busy CPU6 (%),director.per-cpu-busy CPU7 (%),director.per-cpu-busy CPU8 (%),director.per-cpu-busy CPU9 (%),director.per-cpu-busy CPU20 (%),director.per-cpu-busy CPU10 (%),director.per-cpu-busy CPU21 (%)
2018-11-06 12:59:47,175.643,1437.909,0,82722,64153,2278,7458,17641,32475,2433,8926,5025,22,11204,2,0,0,2,0,1,0,1,5,99,0,0,26,7,0,0,22,100,0,0,0,1,0,9,0
2018-11-06 13:04:46,307.629,2246.358,0,139215,122233,2997,10107,43692,77986,3201,9403,6906,23,12399,10,0,0,3,0,1,1,0,0,99,0,1,36,8,0,6,23,100,0,0,0,0,0,9,0
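To illustrate how a reader of such a sink file could tell the "Time" header apart from data rows without raising a ParseException, here is a minimal Java sketch. It is not the code in VPLEX.scala. The class name `SinkLine` and the method `timestampOf` are hypothetical, and the timestamp pattern is only assumed from the two sample rows above.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical helper for illustration; not part of UniversalStorageCollector.
final class SinkLine {
    // Assumed layout of the first CSV field in data rows, e.g. "2018-11-06 12:59:47".
    private static final DateTimeFormatter TIMESTAMP =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

    private SinkLine() {}

    /**
     * Returns the timestamp in the first field of a CSV line, or an empty
     * Optional if the line is the "Time" header or the field is not a date.
     */
    static Optional<LocalDateTime> timestampOf(String line) {
        int comma = line.indexOf(',');
        String first = (comma < 0 ? line : line.substring(0, comma)).trim();
        if (first.equals("Time")) {
            return Optional.empty(); // header line: carries metric names, no timestamp
        }
        try {
            return Optional.of(LocalDateTime.parse(first, TIMESTAMP));
        } catch (DateTimeParseException e) {
            return Optional.empty(); // malformed or partially written row
        }
    }
}
```

With a helper like this, a caller could skip every line for which `timestampOf` returns empty, so the header never reaches the date parser.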

Every sink file usc is collecting from has exactly one header line starting with "Time":

service@rz2-vplex-02:/var/log/VPlex/cli/reports> for i in $(ls -1 direct*.csv); do LINEF=$(grep ^Time $i | wc -l); if [ $LINEF -gt 1 ]; then echo "More than one time line"; fi; done
service@rz2-vplex-02:/var/log/VPlex/cli/reports>

So in nearly every case, usc handles these "Time" header lines correctly, but in some cases we see the parsing error above. I'm asking because we observe gaps in metric collection, and this error might be an explanation, but I'm still unsure about this. As we are using quite a large number of virtual volumes (2000), target-based VPLEX monitors create very long lines of header and metric data:

service@rz2-vplex-02:/var/log/VPlex/cli/reports> head -1 director-2-3-A_fe-lu.csv | wc -c
489853

which might be a problem too.

Best regards, Oliver