bancodobrasil / stop-analyzing-api

Stop Analyzing API is the core module of a Tinder-like tool to help your customers make up their mind with no pain
MIT License

Complement crawled data #27

Open tiagostutz opened 4 years ago

tiagostutz commented 4 years ago

The result of the crawler made by @douglasferlini in #5 is a JSON array with the overall information of the dresses of La Fiancee. Now we need to get the details of the dresses and generate a more detailed JSON array.

To do so, you will read the attached JSON and make a request to the product API, providing the urlPart in the following format:

const productID = "marca-melissa-sweet"; // a urlPart value taken from the JSON array
await fetch("https://www.lafiancee.com.br/_api/wix-ecommerce-storefront-web/api", {
    "credentials": "include",
    "headers": {
        "Accept": "*/*",
        "Authorization": "brUTfgwc9eaqQ4m_KjbIkjnR-MRt9rGfCLGikGEPiRU.eyJpbnN0YW5jZUlkIjoiMWI0OTQ1ODItZDg5Zi00MmY2LTg0YzAtNTAxOGE3NzI1Y2MyIiwiYXBwRGVmSWQiOiIxMzgwYjcwMy1jZTgxLWZmMDUtZjExNS0zOTU3MWQ5NGRmY2QiLCJtZXRhU2l0ZUlkIjoiN2RlM2ExNjgtNDEyNC00NDljLTg4ZDYtZmViNjkzYWY3NzRjIiwic2lnbkRhdGUiOiIyMDIwLTA5LTIzVDEyOjI3OjE4LjUyOVoiLCJ2ZW5kb3JQcm9kdWN0SWQiOiJQcmVtaXVtMSIsImRlbW9Nb2RlIjpmYWxzZSwiYWlkIjoiOWE0ZjJjNDAtMTIzNC00ZGM3LTg3OWEtMjIzZDMxMzI0N2E1IiwiYmlUb2tlbiI6IjY2YWFlNGVhLTk5YmItMDY2YS0wYzE2LWFlYWUzNGRkMmI4ZSIsInNpdGVPd25lcklkIjoiZmI0Y2Y2ODQtODZkZS00N2E0LWE2NjUtZjE4ZDcxYzA3YzUxIn0",
        "Content-Type": "application/json; charset=utf-8",
    },
    "body": `{"query":"query getProductBySlug($externalId: String!, $slug: String!, $withPricePerUnit: Boolean!, $withCountryCodes: Boolean!) {
          appSettings(externalId: $externalId) {
            widgetSettings
      }
      catalog {
            product(slug: $slug, onlyVisible: true) {
                id
                description
                isVisible
                sku
                ribbon
                price
                comparePrice
                discountedPrice
                formattedPrice
                formattedComparePrice
                formattedDiscountedPrice
                pricePerUnit @include(if: $withPricePerUnit)
                formattedPricePerUnit @include(if: $withPricePerUnit)
                pricePerUnitData @include(if: $withPricePerUnit) {
                baseQuantity
                baseMeasurementUnit
          }
          seoTitle
          seoDescription
          createVersion
          digitalProductFileItems {
                fileId
                fileType
                fileName
          }
          productItems {
                price
                comparePrice
                formattedPrice
                formattedComparePrice
                pricePerUnit @include(if: $withPricePerUnit)
                formattedPricePerUnit @include(if: $withPricePerUnit)
                optionsSelections
                isVisible
                inventory {
                status
                quantity
            }
            sku
            weight
            surcharge
            subscriptionPlans {
                list {
                id
                price
                formattedPrice
                pricePerUnit @include(if: $withPricePerUnit)
                formattedPricePerUnit @include(if: $withPricePerUnit)
              }
            }
          }
          name
          isTrackingInventory
          inventory {
            status
            quantity
          }
          isVisible
          isManageProductItems
          isInStock
          media {
            id
            url
            fullUrl
            altText
            thumbnailFullUrl: fullUrl(width: 50, height: 50)
            mediaType
            videoType
            videoFiles {
                url
                width
                height
                format
                quality
            }
            width
            height
            index
            title
          }
          customTextFields {
            title
            isMandatory
            inputLimit
          }
          nextOptionsSelectionId
          options {
            title
            optionType
            selections {
                id
                value
                description
                linkedMediaItems {
                    altText
                    url
                    fullUrl
                    thumbnailFullUrl: fullUrl(width: 50, height: 50)
                    mediaType
                    width
                    height
                    index
                    title
                    videoFiles {
                        url
                        width
                        height
                        format
                        quality
                    }
                }
            }
          }
          productType
          urlPart
          additionalInfo {
                id
            title
            description
            index
          }
          subscriptionPlans {
                list(onlyVisible: true) {
                  id
              name
              tagline
              frequency
              duration
              price
              formattedPrice
              pricePerUnit @include(if: $withPricePerUnit)
              formattedPricePerUnit @include(if: $withPricePerUnit)
            }
            oneTimePurchase {
                  index
            }
          }
          discount {
                mode
            value
          }
          currency
          weight
          seoJson
        }
      }
      localeData(language: "en") @include(if: $withCountryCodes) {
            countries {
              key
          shortKey
        }
      }
    }","variables":{"slug":${JSON.stringify(productID)},"externalId":"","withPricePerUnit":true,"withCountryCodes":false},"source":"WixStoresWebClient","operationName":"getProductBySlug"}`,
    "method": "POST",
});

This request returns a JSON containing the product options; each option's title and selections will be the features.

With this enhanced JSON Array we can build the database to serve this data.
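
One detail worth checking in the request above: JSON does not allow raw line breaks inside a string value, so a multi-line GraphQL query pasted directly into a hand-written body may be rejected as malformed. A minimal sketch of building the body with JSON.stringify instead, which escapes the line breaks and quotes the slug, is shown below. The helper name buildProductRequestBody and the shortened sampleQuery are assumptions for illustration, not part of the original snippet.

```javascript
// Hypothetical helper: builds the POST body for the getProductBySlug operation.
// `query` is the GraphQL document from the snippet above, passed as a plain string.
function buildProductRequestBody(slug, query) {
  // JSON.stringify escapes newlines and quotes, so the result is always valid JSON.
  return JSON.stringify({
    query,
    variables: {
      slug,
      externalId: "",
      withPricePerUnit: true,
      withCountryCodes: false,
    },
    source: "WixStoresWebClient",
    operationName: "getProductBySlug",
  });
}

// Shortened stand-in for the full query, used only to demonstrate the escaping.
const sampleQuery = `query getProductBySlug($slug: String!) {
  catalog {
    product(slug: $slug, onlyVisible: true) {
      id
      urlPart
    }
  }
}`;

const body = buildProductRequestBody("marca-melissa-sweet", sampleQuery);
console.log(JSON.parse(body).variables.slug);
```

The resulting string can be passed as the "body" of the fetch call in place of the template literal.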

afiorentino commented 4 years ago

Hi! I believe I should be able to tackle this. I've already forked the repository to https://www.github.com/afiorentino/stop-analyzing-api

tiagostutz commented 4 years ago

Awesome, @afiorentino!! I just assigned this task to you. If you think it would be better to handle it in another repository or in another programming language, feel free to do so, because this is not the core of stop-analyzing-api. But if you are comfortable doing it here, that is perfect too!

Thanks!

tiagostutz commented 4 years ago

Oh I forgot to attach the JSON crawled by @douglasferlini. Here it goes:

lafiancee.json.zip

afiorentino commented 4 years ago

Hi, @tiagostutz

I've been working on this task and I believe that I'm almost finished. I'm curious whether what I'm building is intended to be specific to the lafiancee site, or whether I should try to be more generic so that part(s) of my contribution can be reused.

tiagostutz commented 4 years ago

This case will be La Fiancee specific because the payload comes from its website. But since the site is built with Wix, I think this work could easily be reused for other Wix-based sites, although that is not a requirement.

afiorentino commented 4 years ago

Hi @tiagostutz

I appreciate your patience with me regarding the completion of this task. I'm nearly done, but I'm getting a 400 response code for the URL in the above code snippet. My repository is located at https://github.com/afiorentino/stop-analyzing-lafiancee-enhanced-json

I would love any feedback you have for me at this current stage and also a solution to the 400 response code I'm getting from the lafiancee website.

tiagostutz commented 4 years ago

Hi there, @afiorentino!

Do you mean a 400 from this link: https://github.com/bancodobrasil/stop-analyzing-api/files/5276983/lafiancee.json.zip ? Please check whether you can download it.

I'll take a look at your repo and will place my observations as issues on your repo. Sounds good?

tiagostutz commented 3 years ago

@KarineValenca could you tackle this?

KarineValenca commented 3 years ago

Sure!

KarineValenca commented 3 years ago

@tiagostutz I'm developing the code in this repo: https://github.com/KarineValenca/stop-analyzing-enhanced-json but I have some questions. For each urlPart in the attached JSON, I'm receiving the following response:

{
   "data":{
      "appSettings":{
         "widgetSettings":{

         }
      },
      "catalog":{
         "product":{
            "id":"1f48a12b-6dc9-4a1f-be94-e0b215a77210",
            "description":"",
            "isVisible":true,
            "sku":"cod316",
            "ribbon":"",
            "price":0.0,
            "comparePrice":0.0,
            "discountedPrice":0.0,
            "formattedPrice":"R$ 0,00",
            "formattedComparePrice":"",
            "formattedDiscountedPrice":"R$ 0,00",
            "pricePerUnit":null,
            "formattedPricePerUnit":null,
            "pricePerUnitData":null,
            "seoTitle":null,
            "seoDescription":null,
            "createVersion":1568930104742000,
            "digitalProductFileItems":[

            ],
            "productItems":[

            ],
            "name":"Marca Melissa Sweet",
            "isTrackingInventory":false,
            "inventory":{
               "status":"in_stock",
               "quantity":0
            },
            "isManageProductItems":false,
            "isInStock":true,
            "media":[
               {
                  "id":"daf591_e1a483b66a084ceea6c8e225e395538c~mv2.jpg",
                  "url":"daf591_e1a483b66a084ceea6c8e225e395538c~mv2.jpg",
                  "fullUrl":"https://static.wixstatic.com/media/daf591_e1a483b66a084ceea6c8e225e395538c~mv2.jpg/v1/fit/w_500,h_500,q_90/file.jpg",
                  "altText":null,
                  "thumbnailFullUrl":"https://static.wixstatic.com/media/daf591_e1a483b66a084ceea6c8e225e395538c~mv2.jpg/v1/fit/w_50,h_50,q_90/file.jpg",
                  "mediaType":"PHOTO",
                  "videoType":null,
                  "videoFiles":[

                  ],
                  "width":540,
                  "height":810,
                  "index":0,
                  "title":"melissa sweet 2 (Copy).jpg"
               },
               {
                  "id":"daf591_35439db0ade24cb49c16a27c9832ac56~mv2.jpg",
                  "url":"daf591_35439db0ade24cb49c16a27c9832ac56~mv2.jpg",
                  "fullUrl":"https://static.wixstatic.com/media/daf591_35439db0ade24cb49c16a27c9832ac56~mv2.jpg/v1/fit/w_500,h_500,q_90/file.jpg",
                  "altText":null,
                  "thumbnailFullUrl":"https://static.wixstatic.com/media/daf591_35439db0ade24cb49c16a27c9832ac56~mv2.jpg/v1/fit/w_50,h_50,q_90/file.jpg",
                  "mediaType":"PHOTO",
                  "videoType":null,
                  "videoFiles":[

                  ],
                  "width":540,
                  "height":810,
                  "index":1,
                  "title":"melissa sweet 1 (Copy).jpg"
               }
            ],
            "customTextFields":[

            ],
            "nextOptionsSelectionId":6,
            "options":[
               {
                  "title":"Estilo",
                  "optionType":"DROP_DOWN",
                  "selections":[
                     {
                        "id":1,
                        "value":"Boho Chic",
                        "description":"Boho Chic",
                        "linkedMediaItems":null
                     }
                  ]
               },
               {
                  "title":"Modelagem",
                  "optionType":"DROP_DOWN",
                  "selections":[
                     {
                        "id":2,
                        "value":"Evasê",
                        "description":"Evasê",
                        "linkedMediaItems":null
                     }
                  ]
               },
               {
                  "title":"Corpete",
                  "optionType":"DROP_DOWN",
                  "selections":[
                     {
                        "id":3,
                        "value":"Alcinhas",
                        "description":"Alcinhas",
                        "linkedMediaItems":null
                     },
                     {
                        "id":4,
                        "value":"Decote em coração",
                        "description":"Decote em coração",
                        "linkedMediaItems":null
                     }
                  ]
               },
               {
                  "title":"Tecido",
                  "optionType":"DROP_DOWN",
                  "selections":[
                     {
                        "id":5,
                        "value":"Renda 3D",
                        "description":"Renda 3D",
                        "linkedMediaItems":null
                     }
                  ]
               }
            ],
            "productType":"physical",
            "urlPart":"marca-melissa-sweet",
            "additionalInfo":[

            ],
            "subscriptionPlans":{
               "list":[

               ],
               "oneTimePurchase":null
            },
            "discount":{
               "mode":"PERCENT",
               "value":0.0
            },
            "currency":"BRL",
            "weight":0.0,
            "seoJson":null
         }
      }
   }
}

Is this right?

Another question, do I need to aggregate the JSON responses in one single file? If yes, how should I aggregate this?

tiagostutz commented 3 years ago

Hi @KarineValenca! Sorry for the late response. Yes, the JSON you posted is for a single product, right?

The resulting aggregated JSON for that would be something like:

[
   {
      "id":"5f355a24ccb4180025ee98ab",
      "title":"Marca Melissa Sweet",
      "subtitle":"",
      "contentURL":"https://www.lafiancee.com.br/product-page/marca-melissa-sweet",
      "media":[
         "https://static.wixstatic.com/media/daf591_e1a483b66a084ceea6c8e225e395538c~mv2.jpg/v1/fit/w_500,h_500,q_90/file.jpg",
         "https://static.wixstatic.com/media/daf591_35439db0ade24cb49c16a27c9832ac56~mv2.jpg/v1/fit/w_500,h_500,q_90/file.jpg"
      ],
      "attributes":[
         {
            "estilo":"Boho Chic"
         },
         {
            "modelagem":"Evasê"
         },
         {
            "corpete":"Alcinhas"
         },
         {
            "corpete":"Decote em coração"
         },
         {
            "tecido":"Renda 3D"
         }
      ]
   },
...
]

There was a discussion about this JSON format at https://github.com/bancodobrasil/stop-analyzing-api/issues/29#issuecomment-702963899 and https://github.com/bancodobrasil/stop-analyzing-api/issues/16#issuecomment-676524806, but now that we are working on a real use case we can see that there were some pitfalls there.

We probably have to update the Database modeling to reflect some changes here, like:

Another important thing to note is that there will be one entry in the attributes field for each value in the selections array of each option in the La Fiancee JSON, as you can see with the corpete attribute.
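
The mapping described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name toAggregatedEntry is hypothetical, the contentURL prefix is inferred from the example entry, and how id and subtitle should be filled is not settled here, so the sketch simply reuses the Wix product id and an empty string.

```javascript
// Assumed prefix, inferred from the contentURL in the example entry above.
const PRODUCT_PAGE_BASE = "https://www.lafiancee.com.br/product-page/";

// Hypothetical helper: maps one getProductBySlug response to one aggregated entry.
function toAggregatedEntry(response) {
  const product = response.data.catalog.product;
  // One attribute object per selection, keyed by the lower-cased option title.
  const attributes = (product.options || []).flatMap((option) =>
    (option.selections || []).map((selection) => ({
      [option.title.toLowerCase()]: selection.value,
    }))
  );
  return {
    id: product.id, // assumption: the example entry uses a different kind of id
    title: product.name,
    subtitle: "", // assumption: left empty, as in the example entry
    contentURL: PRODUCT_PAGE_BASE + product.urlPart,
    media: (product.media || []).map((item) => item.fullUrl),
    attributes,
  };
}

// Trimmed-down version of the response posted earlier in this thread.
const sample = {
  data: {
    catalog: {
      product: {
        id: "1f48a12b-6dc9-4a1f-be94-e0b215a77210",
        name: "Marca Melissa Sweet",
        urlPart: "marca-melissa-sweet",
        media: [],
        options: [
          { title: "Estilo", selections: [{ value: "Boho Chic" }] },
          {
            title: "Corpete",
            selections: [{ value: "Alcinhas" }, { value: "Decote em coração" }],
          },
        ],
      },
    },
  },
};

const entry = toAggregatedEntry(sample);
console.log(JSON.stringify(entry.attributes));
```

Because the option with two selections yields two objects, corpete appears twice in the attributes array, matching the example entry.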

Have I filled all the gaps or do you have some more doubts? Please share here if so.

Thanks!

KarineValenca commented 3 years ago

@tiagostutz Your explanation was clear! I'm just not sure how to fill in the id, subtitle, and contentURL. Could you help me with that? Once I know that, I'll finish the enhanced JSON.

tiagostutz commented 3 years ago

Hi @KarineValenca

If you have any more questions, please post them here! =)

KarineValenca commented 3 years ago

Hi, again @tiagostutz. This is the enhanced JSON I built: https://github.com/KarineValenca/stop-analyzing-enhanced-json/blob/master/enhanced.json

Please check it out. Any problems, let me know.

tiagostutz commented 3 years ago

Perfect!! That's the expected JSON. Thanks! =)

Is this project runnable locally? I mean, if we need to download an updated JSON from La Fiancee site and parse again, is it possible? If so, could you provide a README on how to do so?

KarineValenca commented 3 years ago

Yes, it is possible. I wrote the instructions here: https://github.com/KarineValenca/stop-analyzing-enhanced-json/blob/master/README.md

We just need to update the lafiancee.json file with the new data, and it will fetch the new data.