jabber-tools / cognitive-services-speech-sdk-rs

Apache License 2.0
24 stars 15 forks source link

Allow consumers to not automatically download the SDK #17

Closed Skepfyr closed 4 months ago

Skepfyr commented 5 months ago

We have a security requirement to only source dependencies in a secure way, downloading from an arbitrary URL in the build.rs doesn't satisfy that, so I'm adding the ability to provide your own copy of the SDK.

This change ended up being relatively large so that I could be more sure that I'd made the same change for all 3 target OSs (hopefully this makes it more maintainable in the future). I only need to make it so that we aren't required to download but I ended up making these behavioural changes:

  1. The automatic download is disabled if MS_COG_SVC_SPEECH_SDK_DIR is set.
    • I would actually prefer that you have to explicitly ask it to do the automatic download, as I'd like to prevent accidental downloads. I think this might even be acceptable as realistically anyone who distributes this needs to obtain the SDK library separately. What do you think?
    • I'm doing nothing to check that the version of the SDK at MS_COG_SVC_SPEECH_SDK_DIR is the correct version, I could do something like check the library against a hash (and warn if it's wrong) but I'm not sure whether that's a good idea.
  2. I renamed the RENEW_SDK environment variable to MS_COG_SVC_SPEECH_RENEW_SDK and now it only needs to be set to something, not necessarily "1".
  3. MS_COG_SVC_SPEECH_SKIP_BINDGEN now only needs to be set, not necessarily to "1".
  4. bindgen now writes the bindings to OUT_DIR rather than src/ffi/bindings.rs (unless MS_COG_SVC_SPEECH_UPDATE_BINDINGS is set), and the ffi module will use that unless bindgen is skipped for any reason.
  5. build.rs does nothing when being built by docs.rs so that https://docs.rs/cognitive-services-speech-sdk-rs starts working (as documented in https://docs.rs/about/builds#detecting-docsrs)
adambezecny commented 5 months ago

let's handle this one once we are done with https://github.com/jabber-tools/cognitive-services-speech-sdk-rs/pull/15. :)

adambezecny commented 5 months ago

hi Jack,

version 1.0.0 with your previous changes is out, thanks for that! I'll start looking into this one now. Can you rebase and remove conflicts please?

thank you

Adam

adambezecny commented 5 months ago

on my mac getting this:


   Compiling cognitive-services-speech-sdk-rs v1.0.0 (/Users/adambezecny/rustprojects/cognitive-services-speech-sdk-rs-PR/cognitive-services-speech-sdk-rs)
error[E0433]: failed to resolve: use of undeclared type `Command`
  --> build.rs:89:13
   |
89 |             Command::new("gcc")
   |             ^^^^^^^ use of undeclared type `Command`
   |
help: consider importing this struct
   |
1  + use std::process::Command;
   |

For more information about this error, try `rustc --explain E0433`.
error: could not compile `cognitive-services-speech-sdk-rs` (build script) due to 1 previous error
warning: build failed, waiting for other jobs to finish...
➜  cognitive-services-speech-sdk-rs git:(local-library)
Skepfyr commented 5 months ago

Whoops, thanks for testing on Mac (it was also broken on ARM Linux), I've fixed that error now.

adambezecny commented 5 months ago

I just looked at the new build script end-to-end (maybe for the first time :-) ) and this is getting just too complex. There are too many variables controlling the flow:

For new user it will be real hell to decide which of these to set. It is kind of confusing even for me looking into build script right now.

I am still fighting to understand why we need to keep src/ffi/bindings. Is this for docrs only (since we actually skip whole build.rs when DOCS_RS is set)? I would really like to get rid of this file and use generated files in target instead if possible (if this imposes some limitations let's discuss them and see if it is still acceptable or not).

Skepfyr commented 5 months ago

That's a reasonable reaction, I think we can get rid of a lot of them. Although in general I think the vast majority of users don't need to set any of them and it will just work.

If I set MS_COG_SVC_SPEECH_RENEW_SDK, then bindings are updated. why do we need then MS_COG_SVC_SPEECH_UPDATE_BINDINGS?

If you set MS_COG_SVC_SPEECH_RENEW_SDK then it re-downloads the SDK, setting this has no impact on whether the bindings are re-generated. MS_COG_SVC_SPEECH_UPDATE_BINDINGS should never be set by consumers of this library, it updates src/ffi/bindings.rs, which should only be done by maintainers of this library when you want to update the Speech SDK version.

By default build process wil download sdk and regenerate new bindings, unless I set MS_COG_SVC_SPEECH_SKIP_BINDGEN. In this regard I don't even need MS_COG_SVC_SPEECH_RENEW_SDK, do I?

The build process will always download the SDK, even if you set MS_COG_SVC_SPEECH_SKIP_BINDGEN. However, I think we can delete MS_COG_SVC_SPEECH_RENEW_SDK because Cargo's caching means that it won't redownload the SDK unless something changes, so I think it's correct to always download the SDK in this build script.

It would be great to have just single include var (as opposed to separate MS_COG_SVC_SPEECH_LIB_DIR+MS_COG_SVC_SPEECH_INCLUDE_DIR), can we address all differences in source folder structure (per os) internally without asking the user to set two versy special variables? user should just point it to top level folder. We should provide/document in readme expected folder structure (per OS) to make it easier for user to understand how the layout should look like.

I would like to do this too, however I can't work out a good way to do it, the biggest issue I have is that I need to support using the NuGet package on Linux too, it contains all the information required to build on Linux (the headers + .so) and it's the only format I can source "securely". I guess I could change it so that Linux uses the NuGet pacakge as its source, but that feels a bit weird. The other option would be to write some code to try and work out what format the SDK is in and guess where the files will be.

I am still fighting to understand why we need to keep src/ffi/bindings. Is this for docrs only (since we actually skip whole build.rs when DOCS_RS is set)? I would really like to get rid of this file and use generated files in target instead if possible (if this imposes some limitations let's discuss them and see if it is still acceptable or not).

Precisely, I think the only reason to keep it now is docsrs support, which is a bit sad, but it is important because not supporting docsrs means that no-one who depends on this will have docrs support either.

adambezecny commented 5 months ago
Skepfyr commented 5 months ago

The whole reason I'm making this PR is so that I can use this library in my environment, as far as I know no-one else need this, so I want to make sure that this is at least flexible enough to support what I need. I think that either means having MS_COG_SVC_SPEECH_INCLUDE_DIR and MS_COG_SVC_SPEECH_LIB_DIR or to write code that attempt to detect what type of SDK it's been pointed at in such a way as Linux supports both the nupkg and tar.gz formats.

One more thought regarding binding generating. We agreed we need to have pre-generated bindings for all platforms (due to docsrs). Then it maybe does not make sense to generate bindings during library build by ordinary library users. Maybe we could leave binding generation to library maintainer only (via MS_COG_SVC_SPEECH_UPDATE_BINDINGS). Why do we need to actually generate it every time (I mean header files are always same regardless of architecture of given OS, things like x64,x86/arm does not make difference right?)? Any thoughts here? If we agreed this is not necessary then we can even remove MS_COG_SVC_SPEECH_SKIP_BINDGEN. What do you think?

The issue with this is #ifdef, it's not even clear that the different platforms are homogenous enough (I'm pretty sure I've seen my generated bindings on Linux differ from the checked in ones in tiny ways.) bindgen recommends building the bindings like we do here, but as far as I can tell a lot of projects don't for exactly this reason. I suspect the different platforms are homogenous enough that we can get away with binding_ but I'm not entirely sure.

adambezecny commented 5 months ago

ok, so let's keep MS_COG_SVC_SPEECH_SKIP_BINDGEN variable and by default let's generate bindings. As discussed we should have 3 pre-built bindings for win/linux/mac.

now the hard stuff :)

I need to understand why in your environment you cannot do something like this:

manually or in any similar way have sourced folder with SpeechSDK, e.g. Linux one:

C:\tmp\SpeechSDKAllOS\a\linux>tree
Folder PATH listing for volume OS
Volume serial number is 682F-6519
C:.
└───SpeechSDK-Linux-1.37.0
    ├───include
    │   ├───cxx_api
    │   └───c_api
    └───lib
        ├───arm32
        ├───arm64
        ├───centos7-x64
        ├───x64
        └───x86

This is basically unpacked SpeechSDK-Linux-1.37.0.tar.gz you will get from https://csspeechstorage.blob.core.windows.net/drop/1.37.0/SpeechSDK-Linux-1.37.0.tar.gz (i.e. the link we are using within build.rs to download the SpeechSDK)

Then you would set some env var (e.g. MS_COG_SVC_SPEECH_SDK_DIR) to value C:\tmp\SpeechSDKAllOS\a\linux\SpeechSDK-Linux-1.37.0 (as per listing above).

LIBDIR will be determined within build.rs as **$MS_COG_SVC_SPEECH_SDKDIR\lib/[target arch], e.g. C:\tmp\SpeechSDKAllOS\a\linux\SpeechSDK-Linux-1.37.0\lib\x64**

INCLUDEDIR will will be determined within build.rs as **$MS_COG_SVC_SPEECH_SDK_DIR\include\capi i.e. _C:\tmp\SpeechSDKAllOS\a\linux\SpeechSDK-Linux-1.37.0\include\capi**

Similar we can do for all supported OS (and all architectures), listings of unpacked archives included at the bottom. This would be most elegant way. Is there any reason why this is not good enough for you? You said you have nupkg sourced at your system. Does it have different structure? Please clarify.

If we will conclude you really cannot use approach with MS_COG_SVC_SPEECH_SDK_DIRthen I suggest following approach:

we will have 3 vars:

I believe most users will simply download SDK manually and point MS_COG_SVC_SPEECH_SDK_DIRto extracted archive. Only in very special cases (like yours) users will be using MS_COG_SVC_SPEECH_SDK_LIB_DIR+MS_COG_SVC_SPEECH_INCLUDE_DIRinstead. Would this work for you?

That said if it is possible for you to live with MS_COG_SVC_SPEECH_SDK_DIR only, I would strongly prefer keeping it simple and having only MS_COG_SVC_SPEECH_SDK_DIRvariable.

macos listing

C:\tmp\SpeechSDKAllOS\a\macos>tree
Folder PATH listing for volume OS
Volume serial number is 682F-6519
C:.
└───MicrosoftCognitiveServicesSpeech.xcframework
    └───macos-arm64_x86_64
        └───MicrosoftCognitiveServicesSpeech.framework
            └───Versions
                └───A
                    ├───Headers
                    ├───Modules
                    ├───Resources
                    └───_CodeSignature

windows listing:

C:\tmp\SpeechSDKAllOS\a\windows>tree
Folder PATH listing for volume OS
Volume serial number is 682F-6519
C:.
├───build
│   ├───monoandroid
│   │   └───libs
│   │       ├───arm64-v8a
│   │       ├───armeabi-v7a
│   │       ├───x86
│   │       └───x86_64
│   ├───native
│   │   ├───ARM
│   │   │   └───Release
│   │   ├───ARM64
│   │   │   └───Release
│   │   ├───include
│   │   │   ├───cxx_api
│   │   │   └───c_api
│   │   ├───uap
│   │   │   ├───ARM
│   │   │   │   └───Release
│   │   │   ├───ARM64
│   │   │   │   └───Release
│   │   │   ├───Win32
│   │   │   │   └───Release
│   │   │   └───x64
│   │   │       └───Release
│   │   ├───Win32
│   │   │   └───Release
│   │   └───x64
│   │       └───Release
│   ├───net462
│   ├───net6.0-android
│   ├───net6.0-ios
│   └───Xamarin.iOS
├───lib
│   ├───monoandroid
│   ├───net462
│   ├───net6.0-android
│   ├───net6.0-ios
│   ├───net6.0-maccatalyst
│   ├───netstandard2.0
│   ├───uap10.0
│   └───Xamarin.iOS
├───package
│   └───services
│       └───metadata
│           └───core-properties
├───runtimes
│   ├───android-arm
│   │   └───native
│   ├───android-arm64
│   │   └───native
│   ├───android-x64
│   │   └───native
│   ├───android-x86
│   │   └───native
│   ├───centos7-x64
│   │   └───native
│   ├───ios-arm64
│   │   └───native
│   ├───iossimulator-arm64
│   │   └───native
│   ├───iossimulator-x64
│   │   └───native
│   ├───linux-arm
│   │   ├───lib
│   │   │   └───netstandard2.0
│   │   └───native
│   ├───linux-arm64
│   │   ├───lib
│   │   │   └───netstandard2.0
│   │   └───native
│   ├───linux-x64
│   │   ├───lib
│   │   │   └───netstandard2.0
│   │   └───native
│   ├───maccatalyst-arm64
│   │   └───native
│   ├───maccatalyst-x64
│   │   └───native
│   ├───osx-arm64
│   │   ├───lib
│   │   │   └───netstandard2.0
│   │   └───native
│   ├───osx-x64
│   │   ├───lib
│   │   │   └───netstandard2.0
│   │   └───native
│   ├───win-arm
│   │   ├───native
│   │   └───nativeassets
│   │       └───uap
│   ├───win-arm64
│   │   ├───native
│   │   └───nativeassets
│   │       └───uap
│   ├───win-x64
│   │   ├───native
│   │   └───nativeassets
│   │       └───uap
│   └───win-x86
│       ├───native
│       └───nativeassets
│           └───uap
└───_rels
adambezecny commented 5 months ago

I published PR16 meaning there will be some minor conflict(s) to resolve. sorry :(

adambezecny commented 4 months ago

hi Jack,

long time no hear. I am just wondering if you saw my latest comment on this PR and if it makes sense. Are you going to finish this PR? Let me know please. I have updated github workflow for PRs to make contributing little bit easier, now it tries to compile and run som tests on all 3 supported platforms.

regards,

Adam

Skepfyr commented 4 months ago

Hi, sorry work got very busy over the last couple of weeks and I'm on holiday this week. I hope to take another look at this either next week or the week after.

Skepfyr commented 4 months ago

Hi, sorry about this but some fairly dramatic changes mean that we no longer need this and I no longer have any motivation to complete it. Feel free to use the code yourself if you want. Apologies and thanks for all the time you've spent reviewing this.