MediaJel / mediajel-tracker

hosted mediajel tracker

Research into custom integrations with fewer interactions #430

Open pacholoamit opened 3 months ago

pacholoamit commented 3 months ago

Description

The goal of this ticket is to research an implementation that reduces the number of interactions we need to make when creating custom integrations.

pacholoamit commented 3 months ago

Perhaps take a look at web scraping?

jlaridamediajel commented 3 months ago

Initial guide for this concept.

-tech stack:

Functionalities:

1.) Meta Tags and Comments: Some e-commerce platforms include specific meta tags or comments in the HTML source. For example, Shopify often includes platform-specific meta tags (the example code in this comment checks for `shopify-digital-wallet`).

2.) JavaScript Files and Resources: Specific JavaScript files or resources loaded by the page can hint at the e-commerce platform. For instance, URLs containing wp-content or woocommerce suggest a WooCommerce store, while URLs with cdn.shopify.com suggest Shopify.

3.) CSS Classes and IDs: Certain CSS classes and IDs are unique to specific platforms. For example, classes like woocommerce indicate WooCommerce, and classes like shopify-section indicate Shopify.

4.) Form Actions and Hidden Inputs: The action attribute of forms and hidden input fields often contain platform-specific URLs or values.

5.) HTTP Headers: Some platforms include specific HTTP headers in their responses. For example, Shopify may include headers like x-shopify-stage.

example code concept:

```javascript
const axios = require('axios');
const cheerio = require('cheerio');

// Returns true when the page at `url` shows Shopify-specific markers
async function detectShopify(url) {
    try {
        const response = await axios.get(url);
        const $ = cheerio.load(response.data);

        // Check for Shopify-specific meta tags
        const shopifyMetaTag = $('meta[name="shopify-digital-wallet"]').length > 0;
        if (shopifyMetaTag) {
            return true;
        }

        // Check for Shopify-specific script URLs
        let shopifyScript = false;
        $('script[src]').each((index, element) => {
            if ($(element).attr('src').includes('cdn.shopify.com')) {
                shopifyScript = true;
                return false; // Exit loop early if found
            }
        });

        return shopifyScript;
    } catch (error) {
        console.error(`Error fetching the URL: ${error}`);
        return false;
    }
}

// Example usage
const url = 'https://example-shopify-store.com';
detectShopify(url).then(isShopify => {
    console.log(`Is Shopify: ${isShopify}`);
});
```

6.) Backend-specific URLs (needed): URLs in the site's structure can also be indicative, such as /cart or /checkout paths, which might follow patterns unique to each platform.

example code to check for links:

```javascript
const axios = require('axios');
const cheerio = require('cheerio');

// Returns the unique cart and checkout links found on the page at `url`
async function fetchCartCheckoutLinks(url) {
    try {
        const response = await axios.get(url);
        const $ = cheerio.load(response.data);

        // Selectors for all anchor tags
        const selectors = ['a'];

        const cartCheckoutLinks = new Set();

        // Iterate over the selectors to find links
        selectors.forEach(selector => {
            $(selector).each((index, element) => {
                const link = $(element).attr('href');
                if (link && (link.includes('/cart') || link.includes('/checkout'))) {
                    cartCheckoutLinks.add(link);
                }
            });
        });

        return Array.from(cartCheckoutLinks);
    } catch (error) {
        console.error(`Error fetching the URL: ${error}`);
        return [];
    }
}

// Example usage
const url = 'https://example-website.com';
fetchCartCheckoutLinks(url).then(links => {
    console.log('Cart and Checkout Links found:');
    links.forEach(link => console.log(link));
});
```

*** The purpose of this step is to determine which tag needs to be placed on the site.
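The CSS class and HTTP header signals from the list above could be folded into one scoring helper alongside the script URL hints. This is only a rough sketch, assuming the page source and response headers have already been fetched. `SIGNATURES`, `scorePlatforms` and `guessPlatform` are hypothetical names, and the patterns are just the examples mentioned in this comment (plus the usual WooCommerce plugin path), not a verified list. It uses plain regex matching instead of cheerio to stay dependency-free.

```javascript
// Hypothetical signature table: illustrative patterns only, not exhaustive.
const SIGNATURES = {
  shopify: {
    html: [/cdn\.shopify\.com/i, /class="[^"]*\bshopify-section\b/i],
    headers: ['x-shopify-stage'],
  },
  woocommerce: {
    html: [/wp-content\/plugins\/woocommerce/i, /class="[^"]*\bwoocommerce\b/i],
    headers: [],
  },
};

// Count how many signatures of each platform appear in the page.
// `html` is the raw page source; `headers` is assumed to be a plain object
// mapping response header names to values.
function scorePlatforms(html, headers = {}) {
  const headerNames = Object.keys(headers).map((name) => name.toLowerCase());
  const scores = {};
  for (const [platform, sig] of Object.entries(SIGNATURES)) {
    const htmlHits = sig.html.filter((pattern) => pattern.test(html)).length;
    const headerHits = sig.headers.filter((name) => headerNames.includes(name)).length;
    scores[platform] = htmlHits + headerHits;
  }
  return scores;
}

// Return the platform with the most hits, or null when nothing matched.
function guessPlatform(html, headers = {}) {
  const scores = scorePlatforms(html, headers);
  const [best, hits] = Object.entries(scores).sort((a, b) => b[1] - a[1])[0];
  return hits > 0 ? best : null;
}
```

Something like `guessPlatform(response.data, response.headers)` would then give a single answer per site, and the raw hit counts from `scorePlatforms` could feed the percentage display in step 3.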

2.) Automate how we check the values gathered using Snowplow. The concept is similar to how Jest works. Once step #1 is done, we inject the necessary <script> tag into our automation session (using ***TestCafe) and check for the values that should be present based on our tags.

*** Still a blurry concept, but something like this code:

```javascript
import { Selector, ClientFunction } from 'testcafe';
import fs from 'fs';
import path from 'path';

// Load custom script from file system or probably a link to our javascript tag
const customScript = fs.readFileSync(path.resolve(__dirname, 'customScript.js'), 'utf8');

// Function to inject custom script
const injectScript = ClientFunction((script) => {
    const scriptElement = document.createElement('script');
    scriptElement.text = script;
    document.head.appendChild(scriptElement);
});

// Function to get the value of the custom variable from the window object
const getCustomVariable = ClientFunction(() => window.myCustomVariable);

fixture`Custom JS interaction with external site`
    .page`https://example.com`;

test('should get values from external site DOM elements using custom script', async t => {
    // Inject custom script
    await injectScript(customScript);

    // Retrieve values after custom script has been executed
    const h1Text = await Selector('h1').innerText;
    const customVariableValue = await getCustomVariable();

    // Log the results or perform assertions
    console.log('h1Text:', h1Text);
    console.log('customVariableValue:', customVariableValue);

    await t.expect(h1Text).eql('Hello World!');
    await t.expect(customVariableValue).eql('Test Value');
});
```
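For the "check for the necessary values" part, the comparison itself could be a small pure function that the test calls after reading an object from the page. A rough sketch only: the field names in `REQUIRED_TRANSACTION_FIELDS` are made up for illustration and are not our real tag schema, and `checkExpectedValues` is a hypothetical helper.

```javascript
// Hypothetical list of fields a transaction tag is expected to expose.
const REQUIRED_TRANSACTION_FIELDS = ['id', 'total', 'items'];

// Split the required fields into those that carry a value on the captured
// object and those that are missing (undefined, null, or empty string).
function checkExpectedValues(captured, requiredFields) {
  const source = captured || {};
  const present = [];
  const missing = [];
  for (const field of requiredFields) {
    const value = source[field];
    const hasValue = value !== undefined && value !== null && value !== '';
    (hasValue ? present : missing).push(field);
  }
  return { present, missing };
}
```

Keeping this separate from the browser automation means it can be unit tested on its own, and the `present`/`missing` split maps directly onto a pass rate for the result display.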

3.) Display result evaluation: this displays the results of the previous steps as percentages.

something like:

ecommerce platform detection: woocommerce - 100%

Tag values detection: window.transaction - 0%

*This might require creating a custom tag or adjusting our tag script. Please contact Integrations for support.
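The display above could be produced from simple pass/fail lists. A minimal sketch, assuming each step hands over an array of booleans (one per check); `percentPassed` and `formatReport` are hypothetical names, and how the checks are weighted is an open question.

```javascript
// Turn a list of pass/fail checks into a rounded percentage (0 to 100).
function percentPassed(checks) {
  if (checks.length === 0) return 0;
  const passed = checks.filter(Boolean).length;
  return Math.round((passed / checks.length) * 100);
}

// Build the report text; the support hint is only added when the tag
// value checks did not all pass.
function formatReport(platform, platformChecks, tagVariable, tagChecks) {
  const tagScore = percentPassed(tagChecks);
  const lines = [
    `ecommerce platform detection: ${platform} - ${percentPassed(platformChecks)}%`,
    `Tag values detection: ${tagVariable} - ${tagScore}%`,
  ];
  if (tagScore < 100) {
    lines.push('*This might require creating a custom tag or adjusting our tag script. Please contact Integrations for support.');
  }
  return lines.join('\n');
}
```

For example, `formatReport('woocommerce', [true, true], 'window.transaction', [false])` gives the two lines shown above followed by the support hint.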

Other things to consider -

*Another thing to consider is using ML to check the site's source, together with a parser check for the specific elements we usually look for when creating a tag. I am not sure how capable the existing models are, but I think we could train one specifically for our use. scikit-learn, a Python machine learning library, could be a starting point for ML development, though it needs a Python environment. (A bit ambitious, but it would be nice functionality to have.)

pacholoamit commented 2 months ago

We're gonna put this on hold for now