google / schedviz

A tool for gathering and visualizing kernel scheduling traces on Linux machines
Apache License 2.0

Failed to upload trace file trace.tar.gz - Internal Server Error #8

Open juanrubio opened 5 years ago

juanrubio commented 5 years ago

Hi,

I'm getting 'Failed to upload trace file trace.tar.gz Reason: Internal Server Error' while trying to upload a tarball created with this command:

$ sudo ./trace.sh -capture_seconds 10 --out ftrace

What am I missing?

Ubuntu 18.04, yarn 1.19.1, node 10.16.3

sabarabc commented 5 years ago

What is the output of the command, and could you share the tarball? Note that it will reveal what processes are running on your machine, so only share it if that's ok with you.

juanrubio commented 5 years ago

Thanks for the quick response.

I'm not sure what happened there, but I restarted the server and reran the trace.sh tool to obtain a new tarball. Everything now seems to be working fine, so I'm closing the issue.

Thanks!

sabarabc commented 5 years ago

The first tarball you created might have revealed a bug, since traces depend heavily on what is running on the machine. If you still have it, I'd like to take a look at it.

juanrubio commented 5 years ago

I'm afraid I don't have it anymore. I did build and install kernelshark between the two runs, but I don't believe I ran trace-cmd during that time. I did run kernelshark with trace data collected using trace-cmd on an embedded device.

juanrubio commented 5 years ago

I thought I had used the right trace file, but I must have been using the wrong one when I got the 'Internal Server Error' message.

I've been trying to visualize traces collected from two devices: my Linux desktop running Ubuntu and an embedded device running embedded Linux. For some reason, the traces collected from the embedded device are not compatible with schedviz.

I've created a new tarball with traces from this device. Note that I had to modify the trace.sh script to remove bashisms as my device is running Busybox. Also, note that the device has a single-core CPU. The topology directory lives under:

/sys/devices/system/cpu/cpu0

I'm attaching both the traces tarball and the modified trace.sh script: tarces.tar.gz, embedded_trace.sh.tar.gz.

tjake commented 5 years ago

I am also having this issue. I generated a trace on one host and would like to analyze it on another.

tjake commented 5 years ago

@sabarabc the actual error I get is:


Reason:
 Internal Server Error:
Failed to upload trace file: no format found with id: 529
Page: 0 Page Timestamp: 81224889073 Event Index: 0 
Bitfield: d
Data:
00000000  11 02 00 00 da 0a 00 00  14 00 20 00 00 80 08 00  |.......... .....|
00000010  68 51 00 00 2f 6c 69 62  2f 78 38 36 5f 36 34 2d  |hQ../lib/x86_64-|
00000020  6c 69 6e 75 78 2d 67 6e  75 2f 6c 69 62 63 2e 73  |linux-gnu/libc.s|
00000030  6f 2e 36 00                                       |o.6.|
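
For readers decoding the dump by hand: assuming the record begins with the standard ftrace common fields (a 2-byte common_type, 1-byte common_flags, 1-byte common_preempt_count and 4-byte common_pid, little-endian on x86), the first two bytes `11 02` would be the event type ID. A minimal POSIX sh sketch follows; the helper name le_to_dec is made up for illustration and is not part of SchedViz.

```shell
#!/bin/sh
# le_to_dec: hypothetical helper. Takes hex bytes in the order they appear in
# the dump (little-endian) and prints the unsigned decimal value.
le_to_dec() {
    hex=""
    for byte in "$@"; do
        hex="$byte$hex"   # prepend, so the last byte becomes the most significant
    done
    echo "$((0x$hex))"
}

le_to_dec 11 02          # common_type, assuming the layout described above
le_to_dec da 0a 00 00    # common_pid, under the same assumption
```

Under that assumption the first call prints 529, which is consistent with the ID in the error message: the record looks well-formed, and only the matching format description appears to be missing.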

sabarabc commented 5 years ago

@juanrubio Sorry for the delay. I took a look at your trace, and it appears to have around 20 pages' worth of garbage data before the first real page containing tracing data. Your trace also contains events whose formats were not included in the tarball (i.e. not specified in the shell script). I see events with format IDs 6 and 14, which are not the scheduling events. Could you confirm whether your trace buffers are cleared before starting the trace and that nothing else is tracing at the same time?
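One way to check both points by hand is to read the tracefs control files before starting a capture. This is only a sketch: it assumes tracefs is reachable at /sys/kernel/debug/tracing (newer kernels also expose /sys/kernel/tracing), that it runs as root, and the helper names are hypothetical rather than anything trace.sh provides. It sticks to POSIX sh so it should also run under Busybox.

```shell
#!/bin/sh
# Hypothetical helpers; pass the tracefs directory as the first argument,
# e.g. /sys/kernel/debug/tracing (assumed mount point, run as root).

# Print whether tracing is on and which events are currently enabled.
report_tracing_state() {
    dir="$1"
    echo "tracing_on: $(cat "$dir/tracing_on")"
    # set_event lists one enabled event per line; empty output means none.
    events="$(cat "$dir/set_event")"
    echo "enabled events:"
    if [ -n "$events" ]; then
        echo "$events" | sed 's/^/  /'
    else
        echo "  (none)"
    fi
}

# Writing to the "trace" file clears the ring buffer contents.
clear_trace_buffer() {
    dir="$1"
    echo > "$dir/trace"
}

# Example (as root):
#   report_tracing_state /sys/kernel/debug/tracing
#   clear_trace_buffer /sys/kernel/debug/tracing
```

If events other than the scheduling ones show up as enabled before the capture starts, something else is probably tracing at the same time.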

I've modified SchedViz to handle 32-bit traces (you appear to be tracing on a 32-bit machine, is that correct?) and to be more tolerant of missing formats (when passed the fail_on_unknown_event_format=false flag). After deleting the garbage pages from your trace using a hex editor, I'm able to load your trace.

Also, we've added support for recording traces using eBPF, which doesn't require the formats. You can try it by collecting a trace using the collect.sh script (you'll need to upload sched.bt and have bpftrace installed as well). Let me know if this works for you.

sabarabc commented 5 years ago

@tjake Can you create a new issue and share your trace.tar.gz file?