finopsfoundation / focus_converters

Parent repository to hold all common documentation and code samples for all FOCUS Converter projects
MIT License
83 stars 44 forks

[Proposal] Better Documentation About Providers/Cloud Files #346

Open mirusky opened 6 months ago

mirusky commented 6 months ago

Is your feature request related to a problem? Please describe.

Yes. I have seen a lot of open issues about gathering information from providers / clouds:

#336, #331, #318, #304, #293

Almost all of them come down to a lack of information, a missing column, or something similar.

I think the root of the problem is that providers have many report types, available from many places:

Azure:

AWS:

GCP:

Describe the solution you'd like

I believe we should better document which report each transform relates to, since providers have many report types, available from many places.

The issues were probably created because someone didn't have access to the correct report and exported one that is not mapped. This can be solved by providing a better understanding of where to gather the information.

Describe alternatives you've considered

Additional context

mirusky commented 6 months ago

@varunmittal91 what do you think?

Viha27 commented 6 months ago

I have also encountered the issue of missing columns when retrieving reports from different places within the same cloud provider. I agree with the solution proposed by @mirusky. The best approach is to either select a report format from each cloud provider for the converter or specify the report source using an input variable (e.g., --report-source) to handle this issue effectively.
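A minimal sketch of what the suggested `--report-source` option might look like. This is not the converter's actual CLI: the use of `argparse`, the `--provider` flag, the `REPORT_SOURCES` table, and the source names in it are all assumptions made for illustration.

```python
import argparse

# Hypothetical report sources per provider; the real converter may name
# or group these differently.
REPORT_SOURCES = {
    "gcp": ["standard", "detailed", "resellers"],
    "azure": ["ea", "mca"],
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of a --report-source option")
    parser.add_argument("--provider", required=True, choices=sorted(REPORT_SOURCES))
    parser.add_argument(
        "--report-source",
        required=True,
        help="which export of the provider the input file came from",
    )
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse arguments and reject a report source the provider does not offer."""
    parser = build_parser()
    args = parser.parse_args(argv)
    allowed = REPORT_SOURCES[args.provider]
    if args.report_source not in allowed:
        parser.error(f"--report-source for {args.provider} must be one of {allowed}")
    return args


# Example call with an explicit argument list instead of sys.argv.
args = parse_args(["--provider", "gcp", "--report-source", "detailed"])
print(f"Converting a {args.provider} '{args.report_source}' report")
```

With an explicit source, the converter could pick the matching column mapping up front instead of guessing from whatever columns happen to be present.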

varunmittal91 commented 6 months ago

Thank you @Viha27 and @mirusky, and I am sorry for being away.

I like the idea. Could I ask you to create a PR covering some dimensions? We can start with a handful of dimensions (maybe even 3) and iron out the details of what the format could look like.

Also, I was wondering: for some dimensions like ListUnitPricing, we would need support for API calls and would need to add custom calls. So this format could also become a full-blown plugin system.

I am also super curious to create this as a template for vendors to be able to pick up this as a base library for general transforms.

I am back and working again, so please let me know how I can be helpful.
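One way the plugin idea above could be shaped is a small registry that maps an output column to a function producing its value. Everything here is hypothetical: the `register` decorator, the `enrich` helper, and the in-memory `_FAKE_PRICE_LIST` (a stand-in for a real pricing API call) are not part of the converter.

```python
from typing import Callable, Dict, Optional

# A plugin takes one source row (as a dict) and returns the value for one output column.
ColumnPlugin = Callable[[dict], Optional[float]]

_PLUGINS: Dict[str, ColumnPlugin] = {}


def register(column: str) -> Callable[[ColumnPlugin], ColumnPlugin]:
    """Decorator that registers a function as the producer of `column`."""
    def decorator(func: ColumnPlugin) -> ColumnPlugin:
        _PLUGINS[column] = func
        return func
    return decorator


# Stand-in for a pricing API response; a real plugin would make a network call here.
_FAKE_PRICE_LIST = {"sku-123": 0.25}


@register("ListUnitPrice")
def list_unit_price(row: dict) -> Optional[float]:
    """Look up the list price for the row's SKU, or None if it is unknown."""
    return _FAKE_PRICE_LIST.get(row.get("sku.id"))


def enrich(row: dict) -> dict:
    """Return a copy of `row` with every registered plugin column added."""
    enriched = dict(row)
    for column, plugin in _PLUGINS.items():
        enriched[column] = plugin(row)
    return enriched


print(enrich({"sku.id": "sku-123", "cost": 1.0}))
```

Keeping the lookup behind a plain function signature means a vendor could swap in their own price source without touching the rest of the transform.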

mirusky commented 6 months ago

Hi @varunmittal91

What do you mean by dimensions?

For sure, I could provide some examples of how each file / API returns data, which columns exist, and which ones don't. This could help the FOCUS team in development, and even provide some kind of workaround for the open issues.

"I am also super curious to create this as a template for vendors to be able to pick up this as a base library for general transforms."

Yeah, this would be amazing. I'm not familiar with Python itself, but I can help with some ideas and perhaps test data.

varunmittal91 commented 6 months ago

Sorry, @mirusky by dimensions I meant columns.

That would be really awesome. Please let me know how we can collaborate on certain columns/providers.

Viha27 commented 6 months ago

How about either selecting a standardized report format from each cloud provider for the converter, or specifying the report source using an input variable (e.g., --report-source)? Implementing exception handling for missing data columns would only produce empty/null columns, which is not meaningful.

@varunmittal91 Is there a plan for moving forward on JSON support in the FOCUS converters?
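The concern above about silently producing null columns suggests failing fast instead. A minimal sketch, assuming a per-source list of required columns exists; `MissingColumnsError`, `check_report`, and the example column list are hypothetical names, not converter APIs.

```python
from typing import Iterable, List


class MissingColumnsError(ValueError):
    """Raised when an input report lacks columns the conversion needs."""


def missing_columns(available: Iterable[str], required: Iterable[str]) -> List[str]:
    """Return the required column names that are absent, sorted for stable output."""
    return sorted(set(required) - set(available))


def check_report(available: Iterable[str], required: Iterable[str], report_source: str) -> None:
    """Raise instead of letting the conversion emit all-null columns."""
    missing = missing_columns(available, required)
    if missing:
        raise MissingColumnsError(
            f"{report_source} report is missing required columns: {', '.join(missing)}"
        )


required = ["cost", "resource.name"]  # hypothetical requirement list
try:
    check_report(["cost", "currency"], required, "standard")
except MissingColumnsError as err:
    print(err)
```

An error that names the missing columns and the report source tells the user which export to fetch, which is the documentation gap this issue is about.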

mirusky commented 5 months ago

I'll work on GCP today. I will try to map the Standard and Detailed Usage exports (they are pretty similar, with just a few columns different).

I'm not sure where I should put the files themselves, so I'll provide a response here and then we can move on to a PR.

mirusky commented 5 months ago

I don't know which format the converter currently expects or where it is gathered from, so I've just listed the columns to show the differences between them.

Current Standard Detailed Resellers
adjustment_info adjustment_info adjustment_info
adjustment_info.description adjustment_info.description adjustment_info.description
adjustment_info.id adjustment_info.id adjustment_info.id
adjustment_info.mode adjustment_info.mode adjustment_info.mode
adjustment_info.type adjustment_info.type adjustment_info.type
billing_account_id billing_account_id billing_account_id billing_account_id
channel_partner_cost
channel_partner_name
channel_partner_repricing_config_name
cost cost cost cost
cost_at_list cost_at_list cost_at_list cost_at_list
cost_type cost_type cost_type cost_type
credits credits credits credits
credits.amount credits.amount credits.amount credits.amount
credits.channel_partner_amount
credits.customer_amount
credits.full_name credits.full_name credits.full_name
credits.id credits.id credits.id
credits.name credits.name credits.name
credits.type credits.type credits.type
currency currency currency currency
currency_conversion_rate currency_conversion_rate currency_conversion_rate
customer_correlation_id
customer_cost
customer_name
customer_repricing_config_name
entitlement_name
export_time export_time export_time
invoice.month invoice.month invoice.month invoice.month
labels.key labels.key labels.key
labels.value labels.value labels.value
location.country location.country location.country
location.location location.location location.location location.location
location.region location.region location.region location.region
location.zone location.zone location.zone location.zone
payer_billing_account_id
price price
price.effective_price price.effective_price
price.pricing_unit_quantity price.pricing_unit_quantity
price.tier_start_amount price.tier_start_amount price.tier_start_amount
price.unit price.unit
project project project
project.ancestors project.ancestors
project.ancestors.display_name project.ancestors.display_name
project.ancestors.resource_name project.ancestors.resource_name
project.ancestry_numbers project.ancestry_numbers project.ancestry_numbers
project.id project.id project.id
project.labels.key project.labels.key project.labels.key
project.labels.value project.labels.value project.labels.value
project.name project.name project.name
project.number project.number project.number
resource resource
resource.global_name resource.global_name resource.global_name
resource.name resource.name resource.name
seller_name seller_name seller_name seller_name
service.description service.description service.description service.description
service.id service.id service.id
sku.description sku.description sku.description
sku.id sku.id sku.id sku.id
subscription subscription
subscription.instance_id subscription.instance_id
system_labels.key system_labels.key system_labels.key
system_labels.value system_labels.value system_labels.value
tags tags tags
tags.inherited tags.inherited tags.inherited
tags.key tags.key tags.key
tags.namespace tags.namespace tags.namespace
tags.value tags.value tags.value
transaction_type transaction_type transaction_type
usage.amount usage.amount usage.amount usage.amount
usage.amount_in_pricing_units usage.amount_in_pricing_units usage.amount_in_pricing_units usage.amount_in_pricing_units
usage.pricing_unit usage.pricing_unit usage.pricing_unit usage.pricing_unit
usage.unit usage.unit usage.unit usage.unit
usage_end_time usage_end_time usage_end_time usage_end_time
usage_start_time usage_start_time usage_start_time usage_start_time

The Markdown file itself, for anyone who wants to explore: GCP - Formats.md

@varunmittal91 those are the ones for GCP.

As we can see, each report is more or less a subset of the resellers one.

bahung commented 5 months ago

@mirusky did you try the converter with these GCP formats? I am curious about its output. I might try with Azure.

mirusky commented 5 months ago

@bahung I didn't try to run it, but I think the standard GCP report is broken because it doesn't contain price.tier_start_amount, resource.global_name and resource.name, while the others (detailed and resellers) are good.

For Azure, I didn't have time to gather all the available report types to create a similar table. But I would love to see a column mapping between each report type.
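A column mapping between report types, as asked for above, could be generated rather than written by hand. The sketch below builds a presence matrix from per-format column sets; `presence_rows` is a hypothetical helper, and the sets are a tiny illustrative subset of the columns discussed in this thread, not the full exports.

```python
from typing import Dict, List, Set


def presence_rows(formats: Dict[str, Set[str]]) -> List[List[str]]:
    """One row per column name: the name, then 'yes' or '-' for each format."""
    names = list(formats)
    all_columns = sorted(set().union(*formats.values()))
    return [
        [column] + ["yes" if column in formats[name] else "-" for name in names]
        for column in all_columns
    ]


# Illustrative subset only; the full lists are in the GCP table earlier in the thread.
formats = {
    "standard": {"cost", "currency"},
    "detailed": {"cost", "currency", "resource.name", "resource.global_name"},
    "resellers": {"cost", "currency", "resource.name", "resource.global_name"},
}

print("column", *formats)
for row in presence_rows(formats):
    print(*row)
```

Because absent cells are written explicitly as `-`, the resulting table stays unambiguous even after being copied as plain text.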

bahung commented 4 months ago

@mirusky, sorry for being late. Here is a comparison of the Azure EA columns with those output by the converter tool.

Column df_azure_EA df_azure_out
AccountName AccountName AccountName
AccountOwnerId AccountOwnerId AccountOwnerId
AdditionalInfo AdditionalInfo AdditionalInfo
AvailabilityZone AvailabilityZone AvailabilityZone
BilledCost nan BilledCost
BillingAccountId BillingAccountId BillingAccountId
BillingAccountName BillingAccountName BillingAccountName
BillingCurrency BillingCurrency BillingCurrency
BillingCurrencyCode BillingCurrencyCode BillingCurrencyCode
BillingPeriodEnd nan BillingPeriodEnd
BillingPeriodEndDate BillingPeriodEndDate BillingPeriodEndDate
BillingPeriodStart nan BillingPeriodStart
BillingPeriodStartDate BillingPeriodStartDate BillingPeriodStartDate
BillingProfileId BillingProfileId BillingProfileId
BillingProfileName BillingProfileName BillingProfileName
ChargeCategory nan ChargeCategory
ChargeDescription nan ChargeDescription
ChargeFrequency nan ChargeFrequency
ChargePeriodEnd nan ChargePeriodEnd
ChargePeriodStart nan ChargePeriodStart
ChargeType ChargeType ChargeType
CommitmentDiscountCategory nan CommitmentDiscountCategory
CommitmentDiscountId nan CommitmentDiscountId
CommitmentDiscountName nan CommitmentDiscountName
CommitmentDiscountType nan CommitmentDiscountType
ConsumedQuantity nan ConsumedQuantity
ConsumedService ConsumedService ConsumedService
ConsumedUnit nan ConsumedUnit
Cost Cost Cost
CostAllocationRuleName CostAllocationRuleName CostAllocationRuleName
CostCenter CostCenter CostCenter
CostInBillingCurrency CostInBillingCurrency CostInBillingCurrency
Date Date Date
EffectiveCost nan EffectiveCost
EffectivePrice EffectivePrice EffectivePrice
Frequency Frequency Frequency
InvoiceIssuer nan InvoiceIssuer
InvoiceSection InvoiceSection InvoiceSection
InvoiceSectionId InvoiceSectionId InvoiceSectionId
IsAzureCreditEligible IsAzureCreditEligible IsAzureCreditEligible
ListCost nan ListCost
ListUnitPrice nan ListUnitPrice
MeterCategory MeterCategory MeterCategory
MeterId MeterId MeterId
MeterName MeterName MeterName
MeterRegion MeterRegion MeterRegion
MeterSubCategory MeterSubCategory MeterSubCategory
OfferId OfferId OfferId
PartNumber PartNumber PartNumber
PayGPrice PayGPrice PayGPrice
PlanName PlanName PlanName
PricingCategory nan PricingCategory
PricingModel PricingModel PricingModel
PricingQuantity nan PricingQuantity
PricingUnit nan PricingUnit
Product Product Product
ProductName ProductName ProductName
ProductOrderId ProductOrderId ProductOrderId
ProductOrderName ProductOrderName ProductOrderName
Provider nan Provider
Publisher nan Publisher
PublisherName PublisherName PublisherName
PublisherType PublisherType PublisherType
Quantity Quantity Quantity
RegionId nan RegionId
ReservationId ReservationId ReservationId
ReservationName ReservationName ReservationName
ResourceGroup ResourceGroup ResourceGroup
ResourceId ResourceId ResourceId
ResourceLocation ResourceLocation ResourceLocation
ResourceName ResourceName ResourceName
ResourceType nan ResourceType
ServiceCategory nan ServiceCategory
ServiceFamily ServiceFamily ServiceFamily
ServiceInfo1 ServiceInfo1 ServiceInfo1
ServiceInfo2 ServiceInfo2 ServiceInfo2
ServiceName nan ServiceName
SkuId nan SkuId
SkuPriceId nan SkuPriceId
SubAccountId nan SubAccountId
SubAccountName nan SubAccountName
SubscriptionId SubscriptionId SubscriptionId
SubscriptionName SubscriptionName SubscriptionName
Tags Tags Tags
Term Term Term
UnitOfMeasure UnitOfMeasure UnitOfMeasure
UnitPrice UnitPrice UnitPrice
benefitId benefitId benefitId
benefitName benefitName benefitName

gjcampbell commented 1 month ago

I believe that in addition to better documentation around how to set up the cost export, the conversion process will, at least eventually, need to support the different agreement types and schema versions for Azure.

It looks like the conversion configs for Azure were developed against a specific version of the EA (Enterprise Agreement) schema, as is evident from the PascalCase field names. The schemas for MCA and a couple of others use camelCase.
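As a rough illustration of the casing difference, header names could be normalized to PascalCase before the conversion configs are applied. This is only a sketch: `to_pascal_case` and `normalize_headers` are hypothetical helpers, the sample headers are illustrative, and real schemas may differ in more than the first letter.

```python
from typing import Iterable, List


def to_pascal_case(name: str) -> str:
    """Upper-case the first character and leave the rest untouched."""
    return name[:1].upper() + name[1:]


def normalize_headers(headers: Iterable[str]) -> List[str]:
    """Normalize header casing, refusing inputs where two names would collide."""
    normalized = [to_pascal_case(header) for header in headers]
    if len(set(normalized)) != len(normalized):
        raise ValueError("header names collide after case normalization")
    return normalized


print(normalize_headers(["billingAccountId", "meterCategory", "Tags"]))
```

The collision check matters because a file that already mixes both casings would otherwise end up with duplicate column names.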

Also, each agreement schema's versions have different fields. For example, MCA 2019-11-01 does not have ProductName (which is mapped to ChargeDescription) but has an equivalent field, product. The next MCA version does have ProductName.

I'm not suggesting adding conversion configs for each historical agreement type, though it would be nice. I am suggesting that, going forward, the conversion configs should be made schema-version specific.

Given the current project structure, it would probably be quite a maintenance problem to duplicate the configs for each provider's schema-versions (I don't know if schema-version issues are isolated to Azure), so perhaps a normalization process should be introduced. Normalization might mean detecting the version of the schema in each file and applying transformations to adjust the incoming data to match the conversion config's ideal provider-schema-version.
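The normalization step described above might be sketched as follows. The only fact taken from this thread is that MCA 2019-11-01 has `product` where later versions have `ProductName`; detecting a version from a signature column, the version labels, and the `SIGNATURES` and `RENAMES` tables are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Iterable, Optional

# Hypothetical signatures: columns whose presence identifies a schema version.
SIGNATURES: Dict[str, FrozenSet[str]] = {
    "mca-2019-11-01": frozenset({"product"}),
    "target": frozenset({"ProductName"}),
}

# Renames that bring each detected version in line with the schema the
# conversion config was written against ("target" needs none).
RENAMES: Dict[str, Dict[str, str]] = {
    "mca-2019-11-01": {"product": "ProductName"},
    "target": {},
}


def detect_version(headers: Iterable[str]) -> Optional[str]:
    """Return the first version whose signature columns are all present."""
    present = set(headers)
    for version, signature in SIGNATURES.items():
        if signature <= present:
            return version
    return None


def normalize_row(row: dict, version: str) -> dict:
    """Rename the row's keys so they match the target schema."""
    renames = RENAMES[version]
    return {renames.get(key, key): value for key, value in row.items()}


row = {"product": "Virtual Machines", "cost": "1.5"}
version = detect_version(row.keys())
print(version, normalize_row(row, version))
```

With this shape, supporting another schema version means adding one signature and one rename table rather than duplicating a whole conversion config.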

Here are a couple of reasons for this approach.