Samraksh / eMote

eMote OS -- Multiple Ports (using .NET MF v4.3)
0 stars 0 forks source link

emote: use of floating point in MF #164

Closed Nathan-Stohs closed 10 years ago

Nathan-Stohs commented 10 years ago

The other day I noticed that even in our basic Hello World app, the interpreter in TinyCLR was trying to call floating point library functions. Would be good to get a better understanding of the situation. Again this is with the interpreter, not one of our drivers or other implementations.

My feeling is that FP should not be something that just happens...

AnishAtSamraksh commented 10 years ago

Yes. Is there a proposal?

From: Nathan Stohs [mailto:notifications@github.com] Sent: Thursday, July 10, 2014 12:50 PM To: Samraksh/MF Subject: [MF] emote: use of floating point in MF (#164)

The other day I noticed that even in our basic Hello World app, the interpreter in TinyCLR was trying to call floating point library functions. Would be good to get a better understanding of the situation. Again this is with the interpreter, not one of our drivers or other implementations.

My feeling is that FP should not be something that just happens...

— Reply to this email directly or view it on GitHubhttps://github.com/Samraksh/MF/issues/164.

Nathan-Stohs commented 10 years ago

This is a straight-forward characterization... somebody just needs to trace through some executions and gain an understanding of when, why, where, and how often TinyCLR is trying to use floating point. From there we determine if action is needed. Or maybe the result is that I'm wrong and FP in fact does not happen.

At this point I haven't actually determined it is a problem, I just don't think this should happen and the issue merits investigation into why. Otherwise there is no smoking gun that this is for example causing a performance issue or anything like that.

I put the task up to keep it on the company radar and it is otherwise open. If somebody wishes to pick this up they can assign themselves. Otherwise I would presume to pick this up myself at some point in the future.

MichaelAtSamraksh commented 10 years ago

If you see floating point instructions inside libc functions, this may be due to how Mentor Graphics chose to compile newlib libc.a. Newlib may be configured to almost fully avoid using any floating point variables.

Otherwise, TinyCLR uses floating point variables when it manipulates the HeapBlock, and also whenever managed code does certain type conversions involving floating point numbers. For debug builds, CLR_RT_HeapBlock::StoreToReference calls the conversion function. It should be possible to set TINYCLR_EMULATED_FLOATINGPOINT and avoid all CLR calls to floating point math library functions.

I don't see anything in a vanilla HelloWorld that would trigger a conversion. Disassembly of HelloWorld:

.method public hidebysig static void  Main() cil managed
{
  .entrypoint
  // Code size       20 (0x14)
  .maxstack  1
  .locals init ([0] bool CS$4$0000)
  IL_0000:  nop
  IL_0001:  br.s       IL_0010
  IL_0003:  nop
  IL_0004:  ldstr      "program2 Hello World!"
  IL_0009:  call       void [Microsoft.SPOT.Native]Microsoft.SPOT.Debug::Print(string)
  IL_000e:  nop
  IL_000f:  nop
  IL_0010:  ldc.i4.1
  IL_0011:  stloc.0
  IL_0012:  br.s       IL_0003
} // end of method HelloWorld::Main

So I compiled a different custom HelloWorldMF and opened HelloWorldMF.exe inside ildasm.exe (Intermediate Language Disassembler). (Unfortunately ildasm does not directly recognize the packed MSpotV1 PE files). Here is HelloWorldMF:

        public static void Main() {
            long counter = 0;
            while (true) {
                Debug.Print(
                    Resources.GetString(Resources.StringResources.String1) + (counter++).ToString());
                System.Threading.Thread.Sleep(5000);
            }
        }

Here is the output of HellowWorldMF Main(), with "Show source lines":

.method public hidebysig static void  Main() cil managed
{
  .entrypoint
  // Code size       58 (0x3a)
  .maxstack  4
  .locals init ([0] int64 counter,
           [1] int64 CS$0$0000,
           [2] bool CS$4$0001)
  .language '{3F5162F8-07C6-11D3-9053-00C04FA302A1}', '{994B45C4-E6E9-11D2-903F-00C04FA302A1}', '{5A869D0B-6611-11D3-BD2A-0000F80849BD}'
// Source File 'c:\Users\researcher\Documents\Visual Studio 2012\Projects\HelloWorldMF\HelloWorldMF\HelloWorldMF.cs' 
//000011:         public static void Main() {
  IL_0000:  nop
//000012:             long counter = 0;
  IL_0001:  ldc.i4.0
  IL_0002:  conv.i8
  IL_0003:  stloc.0
//000013:             while (true) {
//000014:                 Debug.Print(
//000015:                     Resources.GetString(Resources.StringResources.String1) + (counter++).ToString());
//000016:                 System.Threading.Thread.Sleep(5000);
//000017:             }
//000018:         }
//000019: 
//000020:     }
//000021: }
  IL_0004:  br.s       IL_0036
//000013:             while (true) {
  IL_0006:  nop
//000014:                 Debug.Print(
  IL_0007:  ldc.i4     0x4cc
  IL_000c:  call       string HelloWorldMF.Resources::GetString(valuetype HelloWorldMF.Resources/StringResources)
  IL_0011:  ldloc.0
  IL_0012:  dup
  IL_0013:  ldc.i4.1
  IL_0014:  conv.i8
  IL_0015:  add
  IL_0016:  stloc.0
  IL_0017:  stloc.1
  IL_0018:  ldloca.s   CS$0$0000
  IL_001a:  call       instance string [mscorlib]System.Int64::ToString()
  IL_001f:  call       string [mscorlib]System.String::Concat(string,
                                                              string)
  IL_0024:  call       void [Microsoft.SPOT.Native]Microsoft.SPOT.Debug::Print(string)
  IL_0029:  nop
//000015:                     Resources.GetString(Resources.StringResources.String1) + (counter++).ToString());
//000016:                 System.Threading.Thread.Sleep(5000);
  IL_002a:  ldc.i4     0x1388
  IL_002f:  call       void [mscorlib]System.Threading.Thread::Sleep(int32)
  IL_0034:  nop
//000017:             }
  IL_0035:  nop
//000013:             while (true) {
  IL_0036:  ldc.i4.1
  IL_0037:  stloc.2
//000014:                 Debug.Print(
//000015:                     Resources.GetString(Resources.StringResources.String1) + (counter++).ToString());
//000016:                 System.Threading.Thread.Sleep(5000);
//000017:             }
//000018:         }
//000019: 
//000020:     }
//000021: }
  IL_0038:  br.s       IL_0006
} // end of method HelloWorldMF::Main

It looks to me like there could be scaling going on at IL_0002 and IL0014. However, converting i4 to i8 does not trigger execution of the floating point section of Convert_Internal.

This does bring up a separate issue. While at OSU, I recall being frustrated that UInt16 C# variables were being converted before different operations, wasting even more cycles. In the above program, it looks like using a long variable causes MSIL instruction conv to be called even though the long is only printed out and incremented.

So, Nathan, I can't replicate this issue with the Interpreter calling floating point functions while executing Hello World. Were you able to use gdb's backtrace to find the offending CLR library?

Nathan-Stohs commented 10 years ago

The problem is the latter. A build with -mfloat-abi=softfp will hard fault due to a bogus FP instruction. You mentioned HeapBlock, and that in fact is where we hit the skids, specifically from my backtrace:

CLR_INT32 CLR_RT_HeapBlock::Compare_Values( const CLR_RT_HeapBlock& left, const CLR_RT_HeapBlock& right, bool fSigned )

So it sounds like your intuition may be correct.

I too have pondered if we should just go ahead and use the fixed point mode if this is actually causing a performance issue. However I'm not sure of what improvement to expect, if any. But that is a fix or workaround, the real question here is why.

TINYCLR_EMULATED_FLOATING_POINT does not actually build at all. Setting this option just enables a fixed point engine however it won't actually build without additional pre-processor parameters (fixed point width, etc.). It's not even clear if it is actually supported or is some orphaned feature.

EDIT: Just wanted to point out that of course softfp is not supposed to work because it requires an FPU. This was happening accidentally in master builds for a few days before it was fixed. But if there are no FP calls, then this bug should be moot (or should it? Maybe this is not true). So that trivial bug has potentially exposed something more subtle.

EDIT2: The hard fault occurs before GDB gets to a particular line of code, I suspect the fault hits early on as soon as the instruction gets in the pipeline. But I recall from the asm dump that the first thing that jumped out at me were the presense of VPOP or VPUSH... which now that I think about it may mean that we don't actually have a problem (see edit 3 below).

EDIT3: One possibility of course is that the build error is the only real error and this is all harmless. I want it to remain clear that there is no smoking gun or evidence of a crime here, just something that needs understanding. It could be that the bad build option was including completely unrelated boilerplate code.

MichaelAtSamraksh commented 10 years ago

I bet the compiler output FPA instructions and the HeapBlock debug code called a floating point conversion routine that triggered the hard fault.

"bad linking or bad redefinition theory" When you say "bogus FP instruction," do you mean a library call, or an ARM ISA opcode? If library call, then this sounds like a conflict between what is passed to the linker as already-defined and what is defined in GCC header files. I think library call problem is that when we were using an older compiler, we were compiling with GCC 4.2 but linking to 4.4 or 4.5's libraries, so there were API mismatches that were defined in TinyCLR header files to allow linking to complete with the assumption these would never be called. Another example of this is TinyCLR including old RVDS libraries and linking against them. I have a vague recollection of running into the problem of a TinyCLR header file defining a missing FP pow or division function instead of calling the math library -- this shows up as an redefinition error when using the next CodeSourcery compiler version.

"bad cleaning theory and SymDef mystery" If you saw a bogus FP instruction when using -mfloat-abi=softfp on the DotNow, then there should have been software linkage to show what the FP library call was. I wonder if there is some combination of target options that did not use -mcpu=cortex-m3, or whether our "clean" build target is missing the removal of a library that would have been built with the Cortex-A8. I get a warning related to creating the ARM SymDefs when switching between compiling for SOC_ADAPT and EmoteDotNow, so perhaps the target that creates the SymDefs doesn't clean properly (although I am under the impression that the SymDefs target is related to compatibility with the ARM linker).

"note on presence of FP types in code" So HeapBlock + Debug = code called with type double if !defined(TINYCLR_EMULATED_FLOATING_POINT).
Defining TINYCLR_EMULATED_FLOATING_POINT means no CLR code should use type double (for tool chains that do not have a type "double"). In certain cases, this makes code simpler by using INT64 instead of double, including type casts where there aren't any math operations that would require proper fixed point configuration. When TINYCLR_EMULATED_FLOATING_POINT is not defined (as in our builds), then there do exist floating point types in TinyCLR code. Then the question becomes whether the compiler emits floating point operations or the linker connects libraries to softfp calls.

"precedence of compiler flags and default fpu" I recently noticed that the placement of our "-mcpu=Cortex-M3" GCC flag changed. The docs say -mcpu and -march flags make the right-most definition take higher precedence. I don't know about precedence between mcpu and other flags on the command line, although my guess is that other flags override mcpu flags. Look at the output of "arm-none-eabi-gcc -mcpu=cortex-m3 -Q --help=target":

The following options are target specific:
  -mabi=                                aapcs
  -mabort-on-noreturn                   [disabled]
  -mapcs                                [disabled]
  -mapcs-float                          [disabled]
  -mapcs-frame                          [disabled]
  -mapcs-reentrant                      [disabled]
  -mapcs-stack-check                    [disabled]
  -march=                               armv2
  -marm                                 [enabled]
  -mbig-endian                          [disabled]
  -mcallee-super-interworking           [disabled]
  -mcaller-super-interworking           [disabled]
  -mcirrus-fix-invalid-insns            [disabled]
  -mcpu=                                cortex-m3
  -mfix-cortex-m3-ldrd                  [enabled]
  -mfloat-abi=                          soft
  -mfp16-format=                        none
  -mfp=2
  -mfp=3
  -mfpe                                 [disabled]
  -mfpe=2
  -mfpe=3
  -mfpu=                                fpa
  -mhard-float
  -mlittle-endian                       [enabled]
  -mlong-calls                          [disabled]
  -mpic-register=
  -mpoke-function-name                  [disabled]
  -msched-prolog                        [enabled]
  -msingle-pic-base                     [disabled]
  -msoft-float
  -mstructure-size-boundary=            0x20
  -mthumb                               [disabled]
  -mthumb-interwork                     [enabled]
  -mtls-dialect=                        gnu
  -mtp=                                 auto
  -mtpcs-frame                          [disabled]
  -mtpcs-leaf-frame                     [disabled]
  -mtune=                               [default]
  -munaligned-access                    [enabled]
  -mvectorize-with-neon-double          [disabled]
  -mvectorize-with-neon-quad            [enabled]
  -mword-relocations                    [disabled]
  -mwords-little-endian                 [disabled]

  Known ARM ABIs (for use with the -mabi= option):
    aapcs aapcs-linux apcs-gnu atpcs iwmmxt

  Known ARM architectures (for use with the -march= option):
    armv2 armv2a armv3 armv3m armv4 armv4t armv5 armv5e armv5t armv5te armv6
    armv6-m armv6j armv6k armv6s-m armv6t2 armv6z armv6zk armv7 armv7-a armv7-m
    armv7-r armv7e-m ep9312 iwmmxt iwmmxt2 native

  Known __fp16 formats (for use with the -mfp16-format= option):
    alternative ieee none

  Known ARM FPUs (for use with the -mfpu= option):
    fpa fpe2 fpe3 fpv4-sp-d16 maverick neon neon-fp16 neon-vfpv4 vfp vfp3 vfpv3
    vfpv3-d16 vfpv3-d16-fp16 vfpv3-fp16 vfpv3xd vfpv3xd-fp16 vfpv4 vfpv4-d16

  Valid arguments to -mtp=:
    auto cp15 soft

  Known floating-point ABIs (for use with the -mfloat-abi= option):
    hard soft softfp

  Known ARM CPUs (for use with the -mcpu= and -mtune= options):
    arm1020e arm1020t arm1022e arm1026ej-s arm10e arm10tdmi arm1136j-s
    arm1136jf-s arm1156t2-s arm1156t2f-s arm1176jz-s arm1176jzf-s arm2 arm250
    arm3 arm6 arm60 arm600 arm610 arm620 arm7 arm70 arm700 arm700i arm710
    arm7100 arm710c arm710t arm720 arm720t arm740t arm7500 arm7500fe arm7d
    arm7di arm7dm arm7dmi arm7m arm7tdmi arm7tdmi-s arm8 arm810 arm9 arm920
    arm920t arm922t arm926ej-s arm940t arm946e-s arm966e-s arm968e-s arm9e
    arm9tdmi cortex-a15 cortex-a5 cortex-a7 cortex-a8 cortex-a9 cortex-m0
    cortex-m1 cortex-m3 cortex-m4 cortex-r4 cortex-r4f cortex-r5 ep9312 fa526
    fa606te fa626 fa626te fa726te fmp626 generic-armv7-a iwmmxt iwmmxt2 mpcore
    mpcorenovfp native strongarm strongarm110 strongarm1100 strongarm1110 xscale

  TLS dialect to use:
    gnu gnu2

Note -mfpu=fpa, so it would be nice to check the instruction Nathan encountered to see if it was an FPA opcode that would normally be implemented (by the OS?) in an illegal instruction fault handler, assuming the -mfloat-abi=softfp overrode the -mcpu=Cortex-M3's -mfloat-abi=soft but left the -mfpu=fpa. Likewise, it would be nice to know if the -mcpu=Cortex-M3 could override earlier flags if placed to the right of the conflicting flags (doubt it but could also be related to transient nature of this issue).

Nathan-Stohs commented 10 years ago
  1. I meant a bad opcode. Not a library issue. It was generating invalid opcodes such as VPUSH and VPOP (verified in the asm dump), but there may have been others, I only eyeballed it. Wikipedia has a handy table of the valid thumb2 instructions. https://en.wikipedia.org/wiki/ARM_Cortex-M
  2. Interesting that you get a warning, but I think its unrelated. But ARM and THUMB2 builds go in different directories with nothing in common, or so I thought? I usually do cleanbuild anyway because I just assume git would confuse it otherwise, I don't recall any problems or warnings but its hard to catch anything that isn't red.
  3. Sort of addressed above.
  4. I could not verify this quickly, but my recollection was that mcpu does not imply mfpu or the float-abi. But I could very well be wrong. But in any case I think this would only apply if we had a library issue, which we hadn't seen yet (but may be present).

Lets not worry about the library issue unless we actually encounter it, so far we have not, but we might. Exactly how the bad opcodes were included in the first place is perhaps not so important either, except for the question of if this whole thing is a red herring. This could simply be a case of garbage in -- garbage out.

The real question is if a normal build is trying to do soft calls. I will trace an execution and find out.

MichaelAtSamraksh commented 10 years ago

Below is the actual g++ configuration output for EmoteDotNow's target.

  1. VPUSH and VPOP may be indicative of FP instructions. Note -marm is disabled but -mthumb-interwork is enabled, so FP may be allowed anyway with the interwork option.
  2. mcpu does not need to imply mfpu. My point is that -mfpu is set by default to fpa in gcc and g++. So GCC has a default configuration for fpu, even if the flag is not specified. So if you used -mfloat-abi=softfp, then the toolchain may have used FPA instructions. (Although I think FPA is only compatible with ATPCS, not AAPCS. And VFP should be the default with recent arm architectures.)

-mthumb enabled with -marm disabled just means the compiler will only output thumb instructions. But the -mthumb-interwork means the thumb code may call ARM library functions. Although THUMB cannot access VFP registers, the -mfloat-abi=softfp links against ARM library functions that do call FP instructions.

Also note that we're linking with --no-warn-mismatch which I recommend we remove as this might have caught the issue. I also recommend we remove the -mthumb-interwork for Cortex-M3 and verify the assembler is provided a proper cpu configuration. -mthumb-interwork should be meaningless for architecture arm-v7m, so there's no reason to enable it and invite trouble on the chance that GCC doesn't enforce armv7-m addressing in presence of -mthumb-interwork.

 "c:\CodeSourcery-2013-05\bin\arm-none-eabi-g++.exe" -xc++ -Wno-invalid-offsetof -fcheck-new  -fno-threadsafe-statics    -ffunction-sections -fdollars-in-identifiers -fno-exceptions -march=armv7-m -mcpu=cortex-m3 -funsigned-char -mstructure-size-boundary=8  -DTINYCLR_ENABLE_SOURCELEVELDEBUGGING  -mthumb -DCOMPILE_THUMB2 -Wno-psabi -DVERSION_MAJOR=4 -DVERSION_MINOR=3 -DVERSION_BUILD=0 -DVERSION_REVISION=0 -DOEMSYSTEMINFOSTRING="\"Samraksh Copyright (C) 2014 Samraksh and Microsoft.\"" -DPLATFORM_ARM_STM32F10x -DPLATFORM_ARM_EmoteDotNow -DTARGETLOCATION_FLASH -DLITTLE_ENDIAN -DGCC      -mfloat-abi=soft    -mlittle-endian  -g3 -O0 -DSAM_FORCE_GCC_CMDLINE_OPTS -DDEBUG  -D_DEBUG   -Ic:\repo\DotNet-MF\MicroFrameworkPK_v4_3\CLR\Core\HeapPersistence -Ic:\repo\DotNet-MF\MicroFrameworkPK_v4_3\DeviceCode\include -Ic:\repo\DotNet-MF\MicroFrameworkPK_v4_3\DeviceCode\Targets\Native\
STM32F10x\DeviceCode\cmsis -Q --help=target
cc1.exe: warning: command line option '-Wno-invalid-offsetof' is valid for C++/ObjC++ but not for C [enabled by default]
cc1.exe: warning: command line option '-fcheck-new' is valid for C++/ObjC++ but not for C [enabled by default]
cc1.exe: warning: command line option '-fno-threadsafe-statics' is valid for C++/ObjC++ but not for C [enabled by default]
The following options are target specific:
  -mabi=                                aapcs
  -mabort-on-noreturn                   [disabled]
  -mapcs                                [disabled]
  -mapcs-float                          [disabled]
  -mapcs-frame                          [disabled]
  -mapcs-reentrant                      [disabled]
  -mapcs-stack-check                    [disabled]
  -march=                               armv7-m
  -marm                                 [disabled]
  -mbig-endian                          [disabled]
  -mcallee-super-interworking           [disabled]
  -mcaller-super-interworking           [disabled]
  -mcirrus-fix-invalid-insns            [disabled]
  -mcpu=                                cortex-m3
  -mfix-cortex-m3-ldrd                  [enabled]
  -mfloat-abi=                          soft
  -mfp16-format=                        none
  -mfp=2
  -mfp=3
  -mfpe                                 [disabled]
  -mfpe=2
  -mfpe=3
  -mfpu=                                fpa
  -mhard-float
  -mlittle-endian                       [enabled]
  -mlong-calls                          [disabled]
  -mpic-register=
  -mpoke-function-name                  [disabled]
  -msched-prolog                        [enabled]
  -msingle-pic-base                     [disabled]
  -msoft-float
  -mstructure-size-boundary=            0x8
  -mthumb                               [enabled]
  -mthumb-interwork                     [enabled]
  -mtls-dialect=                        gnu
  -mtp=                                 auto
  -mtpcs-frame                          [disabled]
  -mtpcs-leaf-frame                     [disabled]
  -mtune=                               [default]
  -munaligned-access                    [enabled]
  -mvectorize-with-neon-double          [disabled]
  -mvectorize-with-neon-quad            [enabled]
  -mword-relocations                    [disabled]
  -mwords-little-endian                 [disabled]

  Known ARM ABIs (for use with the -mabi= option):
    aapcs aapcs-linux apcs-gnu atpcs iwmmxt

  Known ARM architectures (for use with the -march= option):
    armv2 armv2a armv3 armv3m armv4 armv4t armv5 armv5e armv5t armv5te armv6
    armv6-m armv6j armv6k armv6s-m armv6t2 armv6z armv6zk armv7 armv7-a armv7-m
    armv7-r armv7e-m ep9312 iwmmxt iwmmxt2 native

  Known __fp16 formats (for use with the -mfp16-format= option):
    alternative ieee none

  Known ARM FPUs (for use with the -mfpu= option):
    fpa fpe2 fpe3 fpv4-sp-d16 maverick neon neon-fp16 neon-vfpv4 vfp vfp3 vfpv3
    vfpv3-d16 vfpv3-d16-fp16 vfpv3-fp16 vfpv3xd vfpv3xd-fp16 vfpv4 vfpv4-d16

  Valid arguments to -mtp=:
    auto cp15 soft

  Known floating-point ABIs (for use with the -mfloat-abi= option):
    hard soft softfp

  Known ARM CPUs (for use with the -mcpu= and -mtune= options):
    arm1020e arm1020t arm1022e arm1026ej-s arm10e arm10tdmi arm1136j-s
    arm1136jf-s arm1156t2-s arm1156t2f-s arm1176jz-s arm1176jzf-s arm2 arm250
    arm3 arm6 arm60 arm600 arm610 arm620 arm7 arm70 arm700 arm700i arm710
    arm7100 arm710c arm710t arm720 arm720t arm740t arm7500 arm7500fe arm7d
    arm7di arm7dm arm7dmi arm7m arm7tdmi arm7tdmi-s arm8 arm810 arm9 arm920
    arm920t arm922t arm926ej-s arm940t arm946e-s arm966e-s arm968e-s arm9e
    arm9tdmi cortex-a15 cortex-a5 cortex-a7 cortex-a8 cortex-a9 cortex-m0
    cortex-m1 cortex-m3 cortex-m4 cortex-r4 cortex-r4f cortex-r5 ep9312 fa526
    fa606te fa626 fa626te fa726te fmp626 generic-armv7-a iwmmxt iwmmxt2 mpcore
    mpcorenovfp native strongarm strongarm110 strongarm1100 strongarm1110 xscale

  TLS dialect to use:
    gnu gnu2
Nathan-Stohs commented 10 years ago

I performed an exhaustive trace using the "Hello World" C# app.

Result:

  1. No floating point library functions were called. Conclusions is this appears to be a non-issue and we can write this off as a myth. This would support my "garbage-in garbage-out" theory above. BUT...
  2. Although technically not a floating point function, 64-bit (integer) division was also included in my profiling and also must use software emulation. We are hitting that somewhat hard.

Specifically, we were hitting:

ldivmod uidiv uidivmod

Of these the first is the "real" offender, the others are acting as helper functions.

The MF architecture is to keep time as a global 64-bit variable. All of the 64-bit division operations I traced were due to computations related to time and occurred in the Time PAL driver. It turns out that all MF time operations will usually result in at least one, usually several 64-bit divisions. This happens in large part because MF time is normalized against a fictional 10 MHz clock and normalization involves 64-bit division. We hit this penalty even in a simple GetTime().

For example: return GetMachineTime() / TIMEUNIT_TO_MILLISECONDS; and time = (time * m_Ticks_b + m_Ticks_c) / m_Ticks_a; where time and GetMachineTime() are 64-bit.

As for how much this hurts:

  1. The measured time for a single 64-bit division in high-speed mode was 9.5 microseconds, which would scale to at most 57 microseconds in low-speed mode (it should be at least a little better in practice).
  2. How often this occurs will depend on the application somewhat, but even in Hello World this was happening several times a second (but was without sleep).

Conclusions:

  1. This is probably not a major source of performance loss for high-level C# code or involved in the performance penalty we see there.
  2. I would rate the frequency of the operation as being moderate-to-low. As such, in terms of power efficiency it would probably have a measurable impact in some circumstances but not large by any means.
  3. In terms of latency and real-time functions, especially in low-speed mode this is much more significant. The function does not lock so we don't pay the full amount in latency but it does potentially add to the critical path. Additionally, if you are trying to achieve microsecond precision, things can get very awkward when basic timekeeping operations are consuming a relatively large amount of time.

TODO: It probably does behoove us to look into this a little more and to consider if we can do anything to reduce the impact of the situation. I will do some thinking on this. In the meantime leaving this bug open for comment.

Nathan-Stohs commented 10 years ago

After some thought, I don't see a good solution that doesn't involve entirely re-writing the way MF handles time. Doing so would be very tricky since Time is not just an under-the-hood concept and any changes would have implications all the way up the foodchain to the .NET interface specs and C# user code. Although this problem is an annoyance, IMO it does not rise to the level of making such a re-write a reasonable recourse.

At the PAL layer and above this issue is for now unavoidable.

At the HAL layer, driver writers should only work with time in native ticks. This means driver writers should use HAL_Time* functions such as HAL_Time_CurrentTicks() and NOT use any of the PAL layer Time* fucnctions such as Time_GetMachineTime(). We sort of already knew this anyway and nobody is running afoul of this to my knowledge.

I am closing the bug. The issue has been investigated and I don't anticipate further movement.