Wasted-Audio / hvcc

The heavy hvcc compiler for Pure Data patches. Updated to python3 and additional generators
https://wasted-audio.github.io/hvcc/
GNU General Public License v3.0

feature request: no-std flag #215

Open marchingband opened 2 weeks ago

marchingband commented 2 weeks ago

I am using hvcc to compile for a target that has C++ but no standard library: Circle (https://github.com/rsta2/circle). This means I need to modify the files that hvcc generates, removing <new>, <atomic>, etc.

The target is also ARM, but the build fails to find the dmb instruction, possibly because the RPi Zero (v1) is ARMv6? (Not sure about this.)

I am working on ways to work around these issues, so that I don't have to manually edit these files following every compilation, but I am curious if there is the possibility of a feature here, maybe a --nostd flag or config option.

If so, I'd be happy to go into more detail. Thanks!

dromer commented 2 weeks ago

Hey, thnx for the request. Can you give an example of the kind of modification that seems to be needed?

It's odd to me that these features are not available. What I understand is that circle provides a baremetal target, however we are also able to build for microcontrollers and I'd think that this should not be much different. (note that I don't have any experience with baremetal on raspberry pi)

Also what is "dmb asm"?

marchingband commented 2 weeks ago

I am not sure why circle opted to not include std lib in their toolchain. I also am not sure how it is provided in the Daisy toolchain, which I see works with hvcc.

One modification I need to make is, in Heavy_heavy.cpp I delete these lines:

#include <new>
extern "C" {
  HV_EXPORT HeavyContextInterface *hv_heavy_new(double sampleRate) {
    // allocate aligned memory
    void *ptr = hv_malloc(sizeof(Heavy_heavy));
    // ensure non-null
    if (!ptr) return nullptr;
    // call constructor
    new(ptr) Heavy_heavy(sampleRate);
    return Context(ptr);
  }

  HV_EXPORT HeavyContextInterface *hv_heavy_new_with_options(double sampleRate,
      int poolKb, int inQueueKb, int outQueueKb) {
    // allocate aligned memory
    void *ptr = hv_malloc(sizeof(Heavy_heavy));
    // ensure non-null
    if (!ptr) return nullptr;
    // call constructor
    new(ptr) Heavy_heavy(sampleRate, poolKb, inQueueKb, outQueueKb);
    return Context(ptr);
  }

  HV_EXPORT void hv_heavy_free(HeavyContextInterface *instance) {
    // call destructor
    Context(instance)->~Heavy_heavy();
    // free memory
    hv_free(instance);
  }
} // extern "C"

In HvLightPipe.c, hvcc decides which fence instruction to use. Under Circle the intrinsic __dmb(0xE) gets picked, and the build fails because it cannot be found. I did a little research and it looks like this is an ARMv7 intrinsic, but the RPi Zero (v1) is ARMv6?

I am far from an expert on any of this.

marchingband commented 2 weeks ago

Digging deeper into this today: Circle provides new.h and atomic.h implementations. Unfortunately hvcc expects C++ standard-library-style includes like #include <new>, and there is no way to use the preprocessor or other means to work around this. If it instead used #include "new.h", then I could work around it. No idea if this is getting out of scope, sorry for the messy feature request. Bit of a newb in this department.
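One way generated code could cope with toolchains that lack <new>, sketched here as an assumption and not as something hvcc currently does: include <new> only when __has_include finds it, and otherwise declare placement new directly. DEMO_HAVE_STD_NEW, Demo and demo_roundtrip are hypothetical names. A toolchain that ships its own new.h (as Circle is said to above) would probably need that header in place of the fallback declaration, to avoid declaring placement new twice.

```cpp
#include <stddef.h>  // size_t; normally present even in freestanding toolchains

// Prefer the standard header when the toolchain actually provides it.
#if defined(__has_include)
#  if __has_include(<new>)
#    include <new>
#    define DEMO_HAVE_STD_NEW 1
#  endif
#endif

#ifndef DEMO_HAVE_STD_NEW
// Fallback: minimal placement-new declaration for toolchains without <new>.
inline void *operator new(size_t, void *p) noexcept { return p; }
#endif

// Hypothetical stand-in for a generated context class such as Heavy_heavy.
struct Demo {
  int value;
  explicit Demo(int v) : value(v) {}
  ~Demo() {}
};

// Construct in caller-provided storage, read a field, then destroy,
// mirroring the construct/destroy pattern in the snippet quoted earlier.
static int demo_roundtrip(void *storage, int v) {
  Demo *d = new (storage) Demo(v);
  int out = d->value;
  d->~Demo();
  return out;
}
```

The guard keeps hosted builds unchanged and only takes the fallback path when no <new> header can be found.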

marchingband commented 2 weeks ago

for the dmb stuff, in HvLightPipe.c, hvcc produces:

#if __SSE__ || HV_SIMD_SSE
#include <xmmintrin.h>
#define hv_sfence() _mm_sfence()
#elif __arm__ || HV_SIMD_NEON
  #if __ARM_ACLE
    #include <arm_acle.h>
    // https://msdn.microsoft.com/en-us/library/hh875058.aspx#BarrierRestrictions
    // http://doxygen.reactos.org/d8/d47/armintr_8h_a02be7ec76ca51842bc90d9b466b54752.html
    #define hv_sfence() __dmb(0xE) /* _ARM_BARRIER_ST */
  #elif defined(__GNUC__)
    #define hv_sfence() __asm__ volatile ("dmb 0xE":::"memory")
  #else
    // http://stackoverflow.com/questions/19965076/gcc-memory-barrier-sync-synchronize-vs-asm-volatile-memory
    #define hv_sfence() __sync_synchronize()
  #endif
#elif HV_WIN
// https://msdn.microsoft.com/en-us/library/windows/desktop/ms684208(v=vs.85).aspx
#define hv_sfence() _WriteBarrier()
#else
#define hv_sfence() __asm__ volatile("" : : : "memory")
#endif

In the Circle toolchain, __arm__ is defined, as is __GNUC__, so we get #define hv_sfence() __asm__ volatile ("dmb 0xE":::"memory"), but the assembler complains with: Error: selected processor does not support 'dmb 0xE' in ARM mode
So I must remove all of this, leaving only the default: #define hv_sfence() __asm__ volatile("" : : : "memory")
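A rough sketch of how the ARM branch could be narrowed instead of deleted, assuming the toolchain defines the ACLE macro __ARM_ARCH (recent GCC and Clang do). The dmb instruction is only emitted for ARMv7 and later; everything else falls back to __sync_synchronize(), leaving the choice of barrier to the compiler. demo_sfence and demo_publish are hypothetical names and not part of hvcc. Whether the fallback compiles to an inline barrier or to a library call on a given ARMv6 toolchain is something to verify.

```cpp
// Hypothetical alternative to the hv_sfence() selection quoted above.
#if defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
  // ARMv7 and later: the dmb instruction exists; "st" is the 0xE option.
  #define demo_sfence() __asm__ volatile ("dmb st" ::: "memory")
#elif defined(__GNUC__)
  // Older ARM cores and other GCC/Clang targets: let the compiler
  // choose a full barrier that the selected processor supports.
  #define demo_sfence() __sync_synchronize()
#else
  // Unknown compiler: placeholder only, this provides no barrier.
  #define demo_sfence() ((void) 0)
#endif

// Store a value, then fence so the write is ordered before later stores.
static inline void demo_publish(volatile int *slot, int value) {
  *slot = value;
  demo_sfence();
}
```

Keying on the architecture version, not just on __arm__, avoids assuming that every ARM target implements the same instructions.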

marchingband commented 2 weeks ago

in HvUtils.h hvcc produces:

#if HV_WIN
  #include <windows.h>
  #define hv_atomic_bool volatile LONG
  #define HV_SPINLOCK_ACQUIRE(_x) while (InterlockedCompareExchange(&_x, true, false)) { }
  #define HV_SPINLOCK_TRY(_x) return !InterlockedCompareExchange(&_x, true, false)
  #define HV_SPINLOCK_RELEASE(_x) (_x = false)
#elif HV_ANDROID
  // Android support for atomics isn't that great, we'll do it manually
  // https://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html
  #define hv_atomic_bool hv_uint8_t
  #define HV_SPINLOCK_ACQUIRE(_x) while (__sync_lock_test_and_set(&_x, 1))
  #define HV_SPINLOCK_TRY(_x) return !__sync_lock_test_and_set(&_x, 1)
  #define HV_SPINLOCK_RELEASE(_x) __sync_lock_release(&_x)
#elif __cplusplus
  #include <atomic>
  #define hv_atomic_bool std::atomic_flag
  #define HV_SPINLOCK_ACQUIRE(_x) while (_x.test_and_set(std::memory_order_acquire))
  #define HV_SPINLOCK_TRY(_x) return !_x.test_and_set(std::memory_order_acquire)
  #define HV_SPINLOCK_RELEASE(_x) _x.clear(std::memory_order_release)
#elif defined(__has_include)
  #if __has_include(<stdatomic.h>)
    #include <stdatomic.h>
    #define hv_atomic_bool atomic_flag
    #define HV_SPINLOCK_ACQUIRE(_x) while (atomic_flag_test_and_set_explicit(&_x, memory_order_acquire))
    #define HV_SPINLOCK_TRY(_x) return !atomic_flag_test_and_set_explicit(&_x, memory_order_acquire)
    #define HV_SPINLOCK_RELEASE(_x) atomic_flag_clear_explicit(memory_order_release)
  #endif
#endif
#ifndef hv_atomic_bool
  #define hv_atomic_bool volatile bool
  #define HV_SPINLOCK_ACQUIRE(_x) \
  while (_x) {} \
  _x = true;
  #define HV_SPINLOCK_TRY(_x) \
  if (!_x) { \
    _x = true; \
    return true; \
  } else return false;
  #define HV_SPINLOCK_RELEASE(_x) (_x = false)
#endif

__cplusplus is defined, but <atomic> does not exist. Circle provides an implementation of atomic, but it does not include the type atomic_flag.
So I need to remove all of this, just leaving the default:

#ifndef hv_atomic_bool
  #define hv_atomic_bool volatile bool
  #define HV_SPINLOCK_ACQUIRE(_x) \
  while (_x) {} \
  _x = true;
  #define HV_SPINLOCK_TRY(_x) \
  if (!_x) { \
    _x = true; \
    return true; \
  } else return false;
  #define HV_SPINLOCK_RELEASE(_x) (_x = false)
#endif
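Note that the plain volatile bool fallback is not an atomic test-and-set, so two contexts can both see the flag as clear. A possible middle ground that needs no <atomic> header is the GCC/Clang __atomic builtins. This is only a sketch: the demo_ names are hypothetical, and on some bare-metal targets the builtins may compile to library calls that the toolchain has to provide.

```cpp
// Hypothetical stand-in for hv_atomic_bool; bool is built in for C++.
typedef bool demo_atomic_bool;

// Spin until the flag was previously clear (we now own the lock).
static inline void demo_spinlock_acquire(demo_atomic_bool *flag) {
  while (__atomic_test_and_set(flag, __ATOMIC_ACQUIRE)) { }
}

// Single attempt: returns true if the lock was taken by this call.
static inline bool demo_spinlock_try(demo_atomic_bool *flag) {
  return !__atomic_test_and_set(flag, __ATOMIC_ACQUIRE);
}

// Clear the flag so another context can take the lock.
static inline void demo_spinlock_release(demo_atomic_bool *flag) {
  __atomic_clear(flag, __ATOMIC_RELEASE);
}
```

These are written as functions, not macros, so the try variant can return a value without embedding a return statement in the caller.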

dromer commented 2 weeks ago

You can also link straight to the source-code, rather than copy/pasting it into a post :)

How are other projects dealing with these limitations in Circle?

marchingband commented 2 weeks ago

I found a fork that adds standard library support to Circle, but it's very hard to build, and I'm not sure it will work on macOS.

Many embedded targets, especially MCUs, will not have a C++ standard library, and some will not have a C++ compiler at all, so it is a question of whether you want to improve hvcc for embedded targets. I could see ESP32 being a good target, for example, or some of the new RISC-V MCUs, some of which have built-in audio codecs.

For the ARM targets, I don't know enough to be certain, but I suspect this is a bug in hvcc: there are many ARM instruction sets, so you can't assume that certain instructions are implemented or that certain intrinsics are available and also expect the code to port well to other ARM targets. Again, just my naïve suspicion.

dromer commented 2 weeks ago

It's already reported to work on ESP8266/32, RP2040, and Teensy.

You can disable all ARM/NEON optimizations by compiling with HV_SIMD_NONE.

marchingband commented 2 weeks ago

Amazing! Thanks!