eclipse / omr

Eclipse OMR™ Cross platform components for building reliable, high performance language runtimes
http://www.eclipse.org/omr
Other
940 stars 394 forks source link

Migrate processor and feature detection from OpenJ9 to OMR #4339

Open fjeremic opened 4 years ago

fjeremic commented 4 years ago

Background

The CPU class in the compiler in OMR [1] currently has no way of determining the processor we are running on at runtime. The OMR::CPU class is an extensible class which is meant to be overriden by the target code generator. Inside of these extended codegen classes we have APIs to determine whether we support generating instructions for a particular processor. For example [2] is the override on Z and [3] is the override on Power.

Each code generator seems to use different, although quite similar APIs to determine if we can generate instructions for a particular processor.

As we can see we're somewhat inconsistent in the APIs we use. More over, if you notice the APIs we use across the different codegens will always default down to the minimum Architecture Level Set (ALS) defined in OMR. This means that irregardless of what processor we are actually running on we'll generate instructions for the least supported processor.

This is problematic in dynamic runtimes.

[1] https://github.com/eclipse/omr/blob/20db4fbc1ea570fd806783a97b16831f4f719eed/compiler/env/OMRCPU.cpp [2] https://github.com/eclipse/omr/blob/20db4fbc1ea570fd806783a97b16831f4f719eed/compiler/z/env/OMRCPU.cpp [3] https://github.com/eclipse/omr/blob/20db4fbc1ea570fd806783a97b16831f4f719eed/compiler/p/env/OMRCPU.cpp [4] https://github.com/eclipse/omr/blob/20db4fbc1ea570fd806783a97b16831f4f719eed/compiler/p/codegen/BinaryEvaluator.cpp#L2087-L2096 [5] https://github.com/eclipse/omr/blob/20db4fbc1ea570fd806783a97b16831f4f719eed/compiler/z/codegen/BinaryEvaluator.cpp#L3125-L3128 [6] https://github.com/eclipse/omr/blob/20db4fbc1ea570fd806783a97b16831f4f719eed/compiler/x/codegen/OMRCodeGenerator.cpp#L385-L388

Proposed Solution

Our friends at the OpenJ9 project have implemented processor detection in the port library already, and the JIT compiler in OpenJ9 is already making use of the data provided by the port library and caching the various answers for use in the JIT.

For example on Power and Z we use the j9sysinfo_processor_has_feature API [7] to query whether we have support for a particular CPU feature, which is initialized in the port library on initialization. Example of this is in the j9sysinfo.c file for both Power and Z [8].

Similarly processors are detected via the J9ProcessorDesc struct which is initialized via the j9sysinfo_get_processor_description API which can be found in the same j9sysinfo.c file. The way in which we initialize these structures depends on the OS and architecture type.

What we effectively need to do is to sink the OpenJ9 pieces of the processor and feature detection down into the OMR layer so that we can access it directly from the OMR compiler. The work here will involve investigating what pieces of the OpenJ9 port library are used and finding the correct place to move the APIs in OMR. Processor and feature detection is a delicate piece of OpenJ9 and is crucial for performance so such a change must be staged, that is to first copy the processor detection into OMR, make use of it in OMR, only once that is working we can work on deprecating the OpenJ9 piece to take advantage of the OMR support.

[7] https://github.com/eclipse/openj9/blob/82609efc1fa05a5d1af51ad6ceed7215376dde84/runtime/compiler/z/env/J9CPU.cpp#L230-L301 [8] https://github.com/eclipse/openj9/blob/82609efc1fa05a5d1af51ad6ceed7215376dde84/runtime/port/unix/j9sysinfo.c

fjeremic commented 4 years ago

@0xdaryl FYI. @AidanHa is this something you could tackle?

AidanHa commented 4 years ago

Sure, I can definitely look into this.

AidanHa commented 4 years ago

After reading through the issue and getting familiar with the code, I think I will divide this issue into 4 PRs. 1: Creation of processor and feature detection in OMR (from OpenJ9). 2: Implementation of the new feature throughout out OMR. 3: Implementation of the new feature throughout OpenJ9. 4: Deletion of the processor detection in OpenJ9 port library.

0xdaryl commented 4 years ago

For #2 and #3, by "implementation of the feature" do you mean "exploitation of the feature" ? (for lack of a better word).

AidanHa commented 4 years ago

@0xdaryl that is correct, I couldn't find a better way to phrase that point.

AidanHa commented 4 years ago

Update for those following: I got a working version of processor detection locally on my x86 machine. Will test other architectures, and submit a PR when all is ready!

AidanHa commented 4 years ago

I was wondering how should we be testing feature detection? In my opinion, Openj9 currently has a fairly "weak" test for feature detection (https://github.com/eclipse/openj9/blob/master/runtime/tests/port/si.c#L965-L971), and I would like to improve upon this if possible...

fjeremic commented 4 years ago

I was wondering how should we be testing feature detection? In my opinion, Openj9 currently has a fairly "weak" test for feature detection (eclipse/openj9:runtime/tests/port/si.c@master#L965-L971

On Linux we can likely query /proc/cpuinfo and parse the string and compare it against what we programmatically detect?

AidanHa commented 4 years ago

On Linux we can likely query /proc/cpuinfo and parse the string and compare it against what we programmatically detect?

This might have some issues where the user doesn't have permission to view the output of /proc/cpuinfo, and all that will show up is "permission denied".

IMO, it's very difficult to actually precisely verify if all the known features (for a specific processor) are set...

fjeremic commented 4 years ago

IMO, it's very difficult to actually precisely verify if all the known features (for a specific processor) are set...

I would agree. We'll just have to be very diligent when manually testing that this was done right. i.e. hop on a bunch of different machines and validate before/after. Perhaps using OpenJ9 may help here since processor detection is known to work there already.

AidanHa commented 4 years ago

I am currently on step 2 of resolving this issue: Implementing the port library function throughout OMR and unifying the different API calls for processor detection. One thing I noticed is that the current processor detection in compiler/env/OMRCPU.hpp (TR::Compiler>target.cpu.setProcessor() and TR::Compiler>target.cpu.id() is currently being used downstream in OpenJ9 (Eg: https://github.com/eclipse/openj9/blob/master/runtime/compiler/env/ProcessorDetection.cpp#L696), which makes it difficult to simply remove the current implementation and replace it with the port library version. To work my way around this issue, I've decided to go with the following:

1: In the CPU class in compiler/env/OMRCPU.hpp, add separate processor detection calls that utilize the new port library functions at https://github.com/eclipse/omr/pull/4503.

2: In OpenJ9, reconfigure all of the instances where the old processor detection functions were use to utilize the new port library functions.

3: Safely delete the old processor detection functions from OMR.

Let me know if this works!

AidanHa commented 4 years ago

For clarity: "old processor detection functions" = TR::Compiler>target.cpu.setProcessor(). "new processor detection functions" = the new port library functions.

fjeremic commented 4 years ago

Let me know if this works!

Sounds like a good plan.

harryyu1994 commented 4 years ago

Went through the above discussions:

Here's my understanding:

In OMR we have

env/OMRCPU
z/env/OMRCPU
p/env/OMRCPU
x/env/OMRCPU

In OpenJ9 we have

env/J9CPU
z/env/J9CPU
p/env/J9CPU
x/env/J9CPU

Last Week's Meeting:

@fjeremic @dsouzai Let me know if I understood everything correctly, I want to make sure everything is clear before I proceed to implementation. Thanks!

@mpirvu FYI, this is the processor feature detection work for OMR that I'll be working on for a while, I will likely log all my progress in this issue.

fjeremic commented 4 years ago

^ a question I have with that is in OpenJ9 we call initializeProcessorType() during the onLoadInternal() routine. Where should we do that in OMR?

The initialization should happen in the constructor of the OMRCPU class, i.e. calling the port library and caching the result if needed.

harryyu1994 commented 4 years ago

Had a high priority JITServer item last week, back to this work item now.

Some notes about TR::Compiler->target.cpu, how and where it's initialized

void
OMR::CPU::initializeByHostQuery()
   {

#ifdef OMR_ENV_LITTLE_ENDIAN
   _endianness = TR::endian_little;
#else
   _endianness = TR::endian_big;
#endif

   // Validate host endianness #define with an additional compile-time test
   //
   char SwapTest[2] = { 1, 0 };
   bool isHostLittleEndian = (*(short *)SwapTest == 1);

   if (_endianness == TR::endian_little)
      {
      TR_ASSERT(isHostLittleEndian, "expecting host to be little endian");
      }
   else
      {
      TR_ASSERT(!isHostLittleEndian, "expecting host to be big endian");
      }

#if defined(TR_HOST_X86)
   _majorArch = TR::arch_x86;
   #if defined (TR_HOST_64BIT)
   _minorArch = TR::m_arch_amd64;
   #else
   _minorArch = TR::m_arch_i386;
   #endif
#elif defined (TR_HOST_POWER)
   _majorArch = TR::arch_power;
#elif defined(TR_HOST_ARM)
   _majorArch = TR::arch_arm;
#elif defined(TR_HOST_S390)
   _majorArch = TR::arch_z;
#elif defined(TR_HOST_ARM64)
   _majorArch = TR::arch_arm64;
#elif defined(TR_HOST_RISCV)
   _majorArch = TR::arch_riscv;
   #if defined (TR_HOST_64BIT)
   _minorArch = TR::m_arch_riscv64;
   #else
   _minorArch = TR::m_arch_riscv32;
   #endif
#else
   TR_ASSERT(0, "unknown host architecture");
   _majorArch = TR::arch_unknown;
#endif

   }
fjeremic commented 4 years ago

Let's be consistent with other platforms. If an object can be fully initialized at the allocation point we should initialize the object in the Constructor and simplify the callers by not having to call an additional API which they may or may not be aware of.

harryyu1994 commented 4 years ago

Summary of the Processor Detection OMR Port Library Additions

harryyu1994 commented 4 years ago

How OpenJ9 initializes "TR::Compiler->target.cpu" for x86

harryyu1994 commented 4 years ago
  • when we do self()->queryX86TargetCPUID(), which one are we actually invoking? the one closer to TR::CPU?

This question I can answer it myself. self() returns a pointer to the most derived/specialized class, which means we will always invoke the functions "closer" to the most specialized class. Dynamic polymorphism: we have a base pointer, then rely on the virtual keyword to find the most specialized function. Static polymorphism (what's used here): we will always get the most specialized pointer through self()

fjeremic commented 4 years ago

can we replace this with the OpenJ9 version? @fjeremic

Yes we can. Ideally as you mentioned earlier where we want to arrive at the end is for this processor detection to be migrated into the port library (sysinfo_get_processor_description) and x86 would query that, similarly to other platforms.

harryyu1994 commented 4 years ago

jitGetCPUID() vs omrsysinfo_get_x86_description()

harryyu1994 commented 4 years ago

OMR's cpuid vs. OpenJ9's cpuid (x86)

/ Implemented for x86 & x86_64 bit platforms /

if defined(WIN32)

/* Specific CPUID instruction available in Windows */
__cpuid(cpuInfo, cpuInfo[0]);

elif defined(LINUX) || defined(OSX)

if defined(J9X86)

__asm volatile
("mov %%ebx, %%edi;"
        "cpuid;"
        "mov %%ebx, %%esi;"
        "mov %%edi, %%ebx;"
        :"+a" (cpuInfo[0]), "=S" (cpuInfo[1]), "=c" (cpuInfo[2]), "=d" (cpuInfo[3])
         : :"edi");

elif defined(J9HAMMER)

__asm volatile( "cpuid;" :"+a" (cpuInfo[0]), "=b" (cpuInfo[1]), "=c" (cpuInfo[2]), "=d" (cpuInfo[3]) );

endif

endif

}

- Though I think what's actually being used at the moment on x86 is this:
```cpp
#if defined(OMR_OS_WINDOWS)
#include <intrin.h>
#define cpuid(CPUInfo, EAXValue)             __cpuid(CPUInfo, EAXValue)
#define cpuidex(CPUInfo, EAXValue, ECXValue) __cpuidex(CPUInfo, EAXValue, ECXValue)
#else
#include <cpuid.h>
#include <emmintrin.h>
#define cpuid(CPUInfo, EAXValue)             __cpuid(EAXValue, CPUInfo[0], CPUInfo[1], CPUInfo[2], CPUInfo[3])
#define cpuidex(CPUInfo, EAXValue, ECXValue) __cpuid_count(EAXValue, ECXValue, CPUInfo[0], CPUInfo[1], CPUInfo[2], CPUInfo[3])
harryyu1994 commented 4 years ago

TR_Processor vs. TR_ProcessorDescription (x86)

OMRCodeGenerator with TR_ProcessorDescription

uint32_t _processorSignature = cg->comp()->target().cpu.getX86ProcessorSignature();

   if (isGenuineIntel())
      {
      switch (getCPUFamily(_processorSignature))
         {
         case 0x05: _processorDescription |= TR_ProcessorIntelPentium; break;
         case 0x06:
            {
            uint32_t extended_model = getCPUModel(_processorSignature) + (getCPUExtendedModel(_processorSignature) << 4);
            switch (extended_model)
               {
               case 0x55:
                  _processorDescription |= TR_ProcessorIntelSkylake; break;
               case 0x4f:
                  _processorDescription |= TR_ProcessorIntelBroadwell; break;
               case 0x3f:
               case 0x3c:
                  _processorDescription |= TR_ProcessorIntelHaswell; break;
               case 0x3e:
               case 0x3a:
                  _processorDescription |= TR_ProcessorIntelIvyBridge; break;
               case 0x2a:
               case 0x2d:  // SandyBridge EP
                  _processorDescription |= TR_ProcessorIntelSandyBridge; break;
               case 0x2c:  // WestmereEP
               case 0x2f:  // WestmereEX
                  _processorDescription |= TR_ProcessorIntelWestmere; break;
               case 0x1a:  // Nehalem
                  _processorDescription |= TR_ProcessorIntelNehalem; break;
               case 0x17:  // Harpertown
               case 0x0f:  // Woodcrest/Clovertown
                  _processorDescription |= TR_ProcessorIntelCore2; break;
               default:  _processorDescription |= TR_ProcessorIntelP6; break;
               }
            break;
            }
         case 0x0f: _processorDescription |= TR_ProcessorIntelPentium4; break;
         default:   _processorDescription |= TR_ProcessorUnknown; break;
         }
      }
   else if (isAuthenticAMD())
      {
      switch (getCPUFamily(_processorSignature))
         {
         case 0x05:
            if (getCPUModel(_processorSignature) < 0x04)
               _processorDescription |= TR_ProcessorAMDK5;
            else
               _processorDescription |= TR_ProcessorAMDK6;
            break;
         case 0x06: _processorDescription |= TR_ProcessorAMDAthlonDuron; break;
         case 0x0f:
            if (getCPUExtendedFamily(_processorSignature) < 6)
               _processorDescription |= TR_ProcessorAMDOpteron;
            else
               _processorDescription |= TR_ProcessorAMDFamily15h;
            break;
         default:   _processorDescription |= TR_ProcessorUnknown; break;
         }
      }

omrsysinfo_get_processor_description

intptr_t
omrsysinfo_get_x86_description(struct OMRPortLibrary *portLibrary, OMRProcessorDesc *desc)
{
    uint32_t CPUInfo[4] = {0};
    char vendor[12];
    uint32_t familyCode = 0;
    uint32_t processorSignature = 0;

    desc->processor = OMR_PROCESSOR_X86_UNKNOWN;

    /* vendor */
    omrsysinfo_get_x86_cpuid(CPUID_VENDOR_INFO, CPUInfo);
    memcpy(vendor + 0, &CPUInfo[1], sizeof(uint32_t));
    memcpy(vendor + 4, &CPUInfo[3], sizeof(uint32_t));
    memcpy(vendor + 8, &CPUInfo[2], sizeof(uint32_t));

    /* family and model */
    omrsysinfo_get_x86_cpuid(CPUID_FAMILY_INFO, CPUInfo);
    processorSignature = CPUInfo[0];
    familyCode = (processorSignature & CPUID_SIGNATURE_FAMILY) >> CPUID_SIGNATURE_FAMILY_SHIFT;
    if (0 == strncmp(vendor, CPUID_VENDOR_INTEL, CPUID_VENDOR_LENGTH)) {
        switch (familyCode) {
        case CPUID_FAMILYCODE_INTELPENTIUM:
            desc->processor = OMR_PROCESSOR_X86_INTELPENTIUM;
            break;
        case CPUID_FAMILYCODE_INTELCORE:
        {
            uint32_t modelCode  = (processorSignature & CPUID_SIGNATURE_MODEL) >> CPUID_SIGNATURE_MODEL_SHIFT;
            uint32_t extendedModelCode = (processorSignature & CPUID_SIGNATURE_EXTENDEDMODEL) >> CPUID_SIGNATURE_EXTENDEDMODEL_SHIFT;
            uint32_t totalModelCode = modelCode + extendedModelCode;

            if (totalModelCode > CPUID_MODELCODE_INTELHASWELL) {
                desc->processor = OMR_PROCESSOR_X86_INTELHASWELL;
            } else if (totalModelCode >= CPUID_MODELCODE_SANDYBRIDGE) {
                desc->processor = OMR_PROCESSOR_X86_INTELSANDYBRIDGE;
            } else if (totalModelCode >= CPUID_MODELCODE_INTELWESTMERE) {
                desc->processor = OMR_PROCESSOR_X86_INTELWESTMERE;
            } else if (totalModelCode >= CPUID_MODELCODE_INTELNEHALEM) {
                desc->processor = OMR_PROCESSOR_X86_INTELNEHALEM;
            } else if (totalModelCode == CPUID_MODELCODE_INTELCORE2) {
                desc->processor = OMR_PROCESSOR_X86_INTELCORE2;
            } else {
                desc->processor = OMR_PROCESSOR_X86_INTELP6;
            }
            break;
        }
        case CPUID_FAMILYCODE_INTELPENTIUM4:
            desc->processor = OMR_PROCESSOR_X86_INTELPENTIUM4;
            break;
        }
    } else if (0 == strncmp(vendor, CPUID_VENDOR_AMD, CPUID_VENDOR_LENGTH)) {
        switch (familyCode) {
        case CPUID_FAMILYCODE_AMDKSERIES:
        {
            uint32_t modelCode  = (processorSignature & CPUID_SIGNATURE_FAMILY) >> CPUID_SIGNATURE_MODEL_SHIFT;
            if (modelCode < CPUID_MODELCODE_AMDK5) {
                desc->processor = OMR_PROCESSOR_X86_AMDK5;
            }
            desc->processor = OMR_PROCESSOR_X86_AMDK6;
            break;
        }
        case CPUID_FAMILYCODE_AMDATHLON:
            desc->processor = OMR_PROCESSOR_X86_AMDATHLONDURON;
            break;
        case CPUID_FAMILYCODE_AMDOPTERON:
            desc->processor = OMR_PROCESSOR_X86_AMDOPTERON;
            break;
        }
    }

    desc->physicalProcessor = desc->processor;

    /* features */
    desc->features[0] = CPUInfo[3];
    desc->features[1] = CPUInfo[2];
    desc->features[2] = 0; /* reserved for future expansion */

    return 0;
}
harryyu1994 commented 4 years ago

The above 3 posts suggest that omrsysinfo_get_x86_description() is lacking:

  1. featureFlag8 (cpuidex)
  2. a few (I assume newer) processor types

I'm going to open a separate PR to handle this task.

harryyu1994 commented 4 years ago

@fjeremic Do you think it's better for x86 to go with the Power/Z way for feature query and use j9sysinfo_processor_has_feature() (with J9ProcessorDesc cached) or keep the original way? so in the latter case we would get the feature flags from j9sysinfo_get_processor_description and cache them in flags32_t and query using testAny(). The original way looks nicer, but we would have redundant things such as enum TR_X86ProcessorFeatures, which are already defined by the port library. The Power/Z way looks like old c code but it gets the job done.

enum TR_X86ProcessorFeatures
   {
   TR_BuiltInFPU                    = 0x00000001,
   TR_VirtualModeExtension          = 0x00000002,
   TR_DebuggingExtension            = 0x00000004,
   TR_PageSizeExtension             = 0x00000008,
   TR_RDTSCInstruction              = 0x00000010,
   TR_ModelSpecificRegisters        = 0x00000020,
   TR_PhysicalAddressExtension      = 0x00000040,
   TR_MachineCheckException         = 0x00000080,
   TR_CMPXCHG8BInstruction          = 0x00000100,
   TR_APICHardware                  = 0x00000200,
   // Reserved by Intel             = 0x00000400,
   TR_FastSystemCalls               = 0x00000800,
   TR_MemoryTypeRangeRegisters      = 0x00001000,
   TR_PageGlobalFlag                = 0x00002000,
   TR_MachineCheckArchitecture      = 0x00004000,
   TR_CMOVInstructions              = 0x00008000,
   TR_PageAttributeTable            = 0x00010000,
   TR_36BitPageSizeExtension        = 0x00020000,
   TR_ProcessorSerialNumber         = 0x00040000, // GenuineIntel only
   TR_CLFLUSHInstruction            = 0x00080000, // GenuineIntel only
   // Reserved by Intel             = 0x00100000,
   TR_DebugTraceStore               = 0x00200000, // GenuineIntel only
   TR_ACPIRegisters                 = 0x00400000, // GenuineIntel only
   TR_MMXInstructions               = 0x00800000,
   TR_FastFPSavesRestores           = 0x01000000,
   TR_SSE                           = 0x02000000,
   TR_SSE2                          = 0x04000000, // GenuineIntel Only
   TR_SelfSnoop                     = 0x08000000, // GenuineIntel Only
   TR_HyperThreading                = 0x10000000, // GenuineIntel Only
   TR_ThermalMonitor                = 0x20000000, // GenuineIntel Only
   // Reserved by Intel             = 0x40000000, // IA64-specific
   // Reserved by Intel             = 0x80000000
   TR_X86ProcessorInfoInitialized   = 0x80000000  // FIXME: Using a reserved bit for our purposes.
   };
fjeremic commented 4 years ago

@harryyu1994 ideally the feature flags (which are enum values on x86) should live in the port library. There is no need for us to be duplicating them in the compiler. As such it would be preferable to do it the Power way, i.e. use the J9PORT_*_FEATURE_* flag which is defined in the port library to query the feature in the compiler. This removes any indirection and all duplication and there is a single point of definition for a particular processor feature, which lives in the port library which is accessible by all OMR components.

harryyu1994 commented 4 years ago

Looked into Z/CPU: Unlike X, to determine whether a feature is enabled, it needs two pieces of information (arch and feature), as an example:

// z9 supports DFP in millicode so do not check for DFP support unless we are z10+
   if (TR::Compiler->target.cpu.getSupportsArch(TR::CPU::z10) &&
         j9sysinfo_processor_has_feature(processorDesc, J9PORT_S390_FEATURE_DFP))
      {
      TR::Compiler->target.cpu.setSupportsDecimalFloatingPointFacility(true);
      }

@fjeremic Again, need your input on this

fjeremic commented 4 years ago

@fjeremic Again, need your input on this

Option c. seems to be the most reasonable thing to do. This way the compiler doesn't muck with the port library true source of information.

harryyu1994 commented 4 years ago

The End Goal

Clean Up on Z

@fjeremic What do you think?

harryyu1994 commented 4 years ago

I'm going to keep both set of names for now.

fjeremic commented 4 years ago

I'm going to keep both set of names for now.

Keep both sets. We should aim to phase out the OMR_PROCESSOR_S390_GP* enums eventually. They are not very descriptive.

harryyu1994 commented 4 years ago

I realized there could still be a bit of work after the port library initialization decision, I'm going to not wait for that decision and try to finish off this whole issue by initializing my own omr port library for now. Just realized it doesn't look that hard to get the omr port library when we have OpenJ9 and its port library in hand.

harryyu1994 commented 4 years ago

New API against Old API tests in OpenJ9: https://github.com/eclipse/openj9/pull/9295

harryyu1994 commented 3 years ago

Latest Changes:

Summary of changes:

Old design

New design

Introduce Feature Masks

0xdaryl commented 3 years ago

@fjeremic @dsouzai : There are a lot of detailed comments in this issue tracking the progress of this work (thanks @harryyu1994!). However, can one of you please provide a summary of what work remains to be done from its current state?

@nbhuiyan will return to the port library initialization problem soon (he's working on a PR) which will make the port lib accessible within OMR and should deprecate some of the duplicate code within OMR itself.

fjeremic commented 3 years ago

AFAIK the detection has been sunk into OMR and there are two main pieces still remaining to finish:

  1. Hookup the OMR port library to the detection

    For example we have the following code: https://github.com/eclipse/omr/blob/3d85fbdf9c73e46c6da5df8989c119164711b368/compiler/env/OMRCompilerEnv.cpp#L77-L79

    Which detects the target CPU using the OMR port library. Unfortunately because the port library has not been hooked up to the compiler in OMR this value is NULL, so when we query for processor detection, for example here:https://github.com/eclipse/omr/blob/3d85fbdf9c73e46c6da5df8989c119164711b368/compiler/x/env/OMRCPU.cpp#L244-L256

    We default down to using the old API.

  2. Deprecate the old processor detection code

    When the port library is hooked up to the compiler the above queries should go away since the port library should never be NULL and the old processor detection APIs become dead at that point. We need to clean them up.

fjeremic commented 3 years ago

Just stumbled on eclipse/openj9#12670 where it seems we have duplicated code for processor detection in OpenJ9. We need to remove this as part of the migration effort here to avoid developers adding more code into the OpenJ9 versions.

@nbhuiyan wondering if you have a timeline to resume this work?

nbhuiyan commented 3 years ago

@fjeremic I plan on resuming this work later this week or early next week.

r30shah commented 2 years ago

While fixing a bug to make sure we do vectorize some of the vector API intrinsics for Float type (On Z, we need to have Vector Facility Enhancement-1 installed to generate vector instruction for float type), we found out that we do not set the _flags which were supposed to be checked when portLib is null in compiler [1][2]. While working on the change, I found out that we do not set the _flags on first place, so thought of cleaning up them but bumped into this issue.

I guess continuing discussion from Filip's comment, if we have hooked up the port library to compiler, we can clean-up the old checks so that we would not use them accidentally. Trying to find out in the code-base, if we ever had change for hooking up portlib to compiler but can not find. @0xdaryl @nbhuiyan did we finished working on remaining pieces of work for this issue and can the code-gen go ahead and clean-up old feature detection code?

[1]. https://github.com/eclipse/omr/blob/edf2ae5033cb412348dbeb74d31bab2d26fe933d/compiler/z/env/OMRCPU.cpp#L101-L102 [2]. https://github.com/eclipse/omr/blob/edf2ae5033cb412348dbeb74d31bab2d26fe933d/compiler/z/env/OMRCPU.cpp#L154

nbhuiyan commented 2 years ago

@r30shah I had the work to support the port library usage in the compiler completed a few months ago, but I was not able to get around to opening a PR for it yet. I am working on that and will update this issue soon.

r30shah commented 2 years ago

Thanks a lot @nbhuiyan , We will wait for your PR before cleaning up the old processor feature recognition code. Trying to understand so as for current state, in the function supportsFeature we are not guaranteed to have omrPortLib ? I am asking as we do not set the flags for old API anywhere in the code, and right now code in the same function does seem to call [1].

[1]. https://github.com/eclipse/omr/blob/066d80097f5c72c48456423b87cd495a7c2c8277/port/unix/omrsysinfo.c#L762

nbhuiyan commented 2 years ago

Currently while within the OMR Compiler, there is no guarantee that the port library is initialized. This is because OMR Compiler-based projects can initialize the compiler without the port library, which made it necessary to perform the check for the port library when used within OMR Compiler code. Given that the flags necessary for the old API is never set currently, supportsFeature was only returning the correct results if the port library was available.