Open k1m0ch1 opened 5 months ago
Trying this query locally:
SELECT
items.id,
items.sku,
items.name,
prices.price,
prices.created_at
FROM
items
INNER JOIN
prices
ON
items.id = prices.items_id
WHERE
items.id = '1' AND
prices.created_at LIKE '2024-03-10%'
It took around 6 milliseconds.
Same with this query:
SELECT
items.id,
items.sku,
items.name,
discounts.discount_price,
discounts.original_price,
discounts.created_at
FROM
items
INNER JOIN
discounts
ON
items.id = discounts.items_id
WHERE
items.id = '1' AND
discounts.created_at LIKE '2024-03-10%'
It took around 5 milliseconds, while this query took 3 milliseconds:
select id from items where items.sku = 'A09130001976' and items.name = 'Sasa Gourmet Powder MSG 250 g'
So the process will add around 11 milliseconds, which means every item will access SQLite for 14 or 15 milliseconds in total. If there are 33828 item records, this means 33828 x 15 = 507420 milliseconds = 507 seconds.
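For reference, the same estimate written out as a quick calculation. This is only a sketch of the arithmetic above; the variable names are made up for illustration and the timings are the ones measured locally in this thread.

```python
# Per-item query timings in milliseconds, as measured above.
price_join_ms = 6      # items JOIN prices
discount_join_ms = 5   # items JOIN discounts
item_lookup_ms = 3     # select id from items where sku = ... and name = ...

per_item_ms = price_join_ms + discount_join_ms + item_lookup_ms
item_count = 33828

# The estimate above rounds the per-item cost up to 15 ms.
total_ms = item_count * 15
total_seconds = total_ms / 1000
total_minutes = total_seconds / 60

print(f"per item: {per_item_ms} ms")
print(f"total: {total_ms} ms = {total_seconds:.0f} s = {total_minutes:.1f} min")
```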
If the process usually takes 20 minutes at 3 milliseconds per item, it might take twice as long, around 40 minutes per crawler.
yogya online works like this:
Here it is for alfagift:
The scraper takes 10 minutes longer than normal.
dope 🔥
The actual:
alfagift took about an hour and 4 minutes
yogya online took a bit less than an hour, wtf lol
damn, 2 hours
Problem: The data is duplicated, which makes the database HUGE; the current DB size is 2.4GB.
You can see it here: every item gets around 6 times more records every day, as this query shows:
select * from prices where items_id='1' AND created_at LIKE '2024-01-15%'
The prices table has 7 million records. Now let's break it down by selecting the data from around 2024-01-15: the query took 700 milliseconds and returned around 77290 records. If I group the data by items_id, the query took 50 seconds and returned around 16438 records. This means the data has grown to 470% of its original size, which is why the DB is HUGE.
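A minimal sketch of how the duplication factor for one day could be measured in a single query, instead of comparing two separate result sets. The table layout here is an assumption based on the columns used in the queries in this thread (items_id, price, created_at, with created_at stored as ISO text), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: only the columns that appear in the queries in this thread.
conn.execute(
    "CREATE TABLE prices ("
    "id INTEGER PRIMARY KEY, items_id INTEGER, price INTEGER, created_at TEXT)"
)
sample_rows = [
    (1, 1000, "2024-01-15 01:00:00"),
    (1, 1000, "2024-01-15 05:00:00"),
    (1, 1000, "2024-01-15 09:00:00"),
    (2, 2500, "2024-01-15 01:00:00"),
    (2, 2500, "2024-01-15 05:00:00"),
    (1, 1000, "2024-01-16 01:00:00"),
]
conn.executemany(
    "INSERT INTO prices (items_id, price, created_at) VALUES (?, ?, ?)",
    sample_rows,
)

# Rows stored for the day versus distinct items seen that day.
total_rows, distinct_items = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT items_id) FROM prices "
    "WHERE created_at >= ? AND created_at < ?",
    ("2024-01-15", "2024-01-16"),
).fetchone()
ratio = total_rows / distinct_items
print(f"{total_rows} rows for {distinct_items} items, factor {ratio:.2f}")

# The same ratio for the numbers reported above: 77290 rows, 16438 items.
reported_ratio = 77290 / 16438
print(f"reported factor: {reported_ratio:.2f}")
```

The date filter is written as a range rather than LIKE '2024-01-15%' because a range comparison on created_at can use an ordinary index on that column.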
This is the common script to store the data:
The problem with this script is that it will ALWAYS save a new record.
The challenge is: if I select and join in the query, is it going to take longer?
if checkIdItem is O(1) to resolve the prochecklist
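One possible direction, sketched under assumptions: check for an existing row before inserting, and back that check with an index so it stays cheap. The function name save_price_once_per_day, the index name, and the table layout are hypothetical and not taken from the actual script. An indexed lookup is a B-tree search rather than strictly O(1), but it avoids scanning the whole prices table for every item.

```python
import sqlite3
from datetime import date, timedelta


def save_price_once_per_day(conn, items_id, price, created_at):
    """Insert a price row unless the same item already has this price today.

    Hypothetical helper. Returns True if a row was inserted, False if skipped.
    Assumes created_at is ISO text such as '2024-03-10 01:00:00'.
    """
    day = date.fromisoformat(created_at[:10])
    next_day = day + timedelta(days=1)
    existing = conn.execute(
        "SELECT 1 FROM prices "
        "WHERE items_id = ? AND price = ? "
        "AND created_at >= ? AND created_at < ? LIMIT 1",
        (items_id, price, day.isoformat(), next_day.isoformat()),
    ).fetchone()
    if existing is not None:
        return False  # duplicate for this day, nothing to store
    conn.execute(
        "INSERT INTO prices (items_id, price, created_at) VALUES (?, ?, ?)",
        (items_id, price, created_at),
    )
    return True


conn = sqlite3.connect(":memory:")
# Assumed schema, based on the columns used in the queries in this thread.
conn.execute(
    "CREATE TABLE prices ("
    "id INTEGER PRIMARY KEY, items_id INTEGER, price INTEGER, created_at TEXT)"
)
# Composite index so the existence check does not scan the whole table.
conn.execute(
    "CREATE INDEX idx_prices_item_created ON prices (items_id, created_at)"
)

results = [
    save_price_once_per_day(conn, 1, 12500, "2024-03-10 01:00:00"),  # new
    save_price_once_per_day(conn, 1, 12500, "2024-03-10 07:00:00"),  # duplicate
    save_price_once_per_day(conn, 1, 11900, "2024-03-10 13:00:00"),  # price changed
    save_price_once_per_day(conn, 1, 11900, "2024-03-11 01:00:00"),  # next day
]
stored = conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0]
print(results, stored)
```

Whether one row per item per day per distinct price is the right rule depends on how the data is used later; the sketch only shows the check-then-insert shape.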