materialization during loading #896


GoogleCodeExporter commented 8 years ago
When loading pre-sorted data, WAF files larger than the original input are 
materialized on the loading node. See the plan below, and also the list of the 
WAF files. Just before the load failed, each of the 10 WAF files had grown to 
about 70 GB, while the input being loaded is 585 GB in total. The load 
statement ran for 16 hours and then failed.

load dataset page_views using localfs
(("path"="sensorium-11.ics.uci.edu:///home/kereno/more/270M-sens/pigmix/ADB/page_views/merged-270M-parsed-ADB/merged-270M-ADB-page-views"),
 ("format"="adm")) pre-sorted;

use dataverse kereno;

create type page_info_type as open {};

create type page_views_type as closed {
        id: int32, 
        user: string?,
        action: int32,
        timespent: int32,
        query_term: string?,
        ip_addr: int32,
        timestamp: int32,
        estimated_revenue: double?,
        page_info: page_info_type,
        page_links: {{ page_info_type}}?
};

A few questions:
- Loading the exact same dataset on the same machine (sensorium-11) was 
successful (months ago). What could have changed?
- Would loading the data from several machines help? (I am currently splitting 
the input file and sending it to several nodes.)
- Why are WAF files generated for pre-sorted data (from a single source) 
instead of the data simply being loaded?
- Why does AsterixDB hold on to these files until the instance is stopped, 
only then releasing the disk space?

Thanks,
Keren

sink
-- SINK  |PARTITIONED|
  exchange 
  -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
    project ([])
    -- STREAM_PROJECT  |PARTITIONED|
      exchange 
      -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
        insert into kerena:page_views from %0->$$1 partitioned by [%0->$$2] [bulkload]
        -- BULKLOAD  |PARTITIONED|
          exchange 
          -- HASH_PARTITION_MERGE_EXCHANGE MERGE:[$$2(ASC)] HASH:[$$2]  |PARTITIONED|
            assign [$$2] <- [function-call: asterix:field-access-by-index, Args:[%0->$$1, AInt32: {0}]]
            -- ASSIGN  |PARTITIONED|
              exchange 
              -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                data-scan []<-[$$1] <- loadable_dv:loadable_ds
                -- DATASOURCE_SCAN  |PARTITIONED|
                  exchange 
                  -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                    empty-tuple-source
                    -- EMPTY_TUPLE_SOURCE  |PARTITIONED|

sensorium_sata_tank_2/extra/kereno0)
java      14127    kereno  241u      REG              253,2 80159703040   
39977030 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$15410380363572576594.waf 
(deleted)
java      14127    kereno  242r      REG              253,2 80159703040   
39977030 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$15410380363572576594.waf 
(deleted)
java      14127    kereno  243u      REG              253,2 80164691968   
39977031 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$06794498498075120972.waf 
(deleted)
java      14127    kereno  244r      REG              253,2 80164691968   
39977031 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$06794498498075120972.waf 
(deleted)
java      14127    kereno  245u      REG              253,2 80149217280   
39977032 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$28815598462001012435.waf 
(deleted)
java      14127    kereno  246r      REG              253,2 80149217280   
39977032 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$28815598462001012435.waf 
(deleted)
java      14127    kereno  247u      REG              253,2 80143974400   
39977033 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$43118451391367868377.waf 
(deleted)
java      14127    kereno  248r      REG              253,2 80143974400   
39977033 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$43118451391367868377.waf 
(deleted)
java      14127    kereno  249u      REG              253,2 80147906560   
39977034 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$77320530760352885739.waf 
(deleted)
java      14127    kereno  250r      REG              253,2 80147906560   
39977034 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$77320530760352885739.waf 
(deleted)
java      14127    kereno  253u      REG              253,2 80138731520   
39977037 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$31484805822100819337.waf 
(deleted)
java      14127    kereno  254r      REG              253,2 80138731520   
39977037 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$31484805822100819337.waf 
(deleted)
java      14127    kereno  255u      REG              253,2 80154460160   
39977038 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$53109829179125894197.waf 
(deleted)
java      14127    kereno  256r      REG              253,2 80154460160   
39977038 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$53109829179125894197.waf 
(deleted)
java      14127    kereno  257u      REG              253,2 80126935040   
39977039 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$82953103814847607270.waf 
(deleted)
java      14127    kereno  258r      REG              253,2 80126935040   
39977039 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$82953103814847607270.waf 
(deleted)
java      14127    kereno  259u      REG              253,2 80157081600   
39977040 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$9938873792235505806.waf 
(deleted)
java      14127    kereno  260r      REG              253,2 80157081600   
39977040 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$9938873792235505806.waf 
(deleted)
java      14127    kereno  261u      REG              253,2 80167567360   
39977041 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$67066219710704159904.waf 
(deleted)
java      14127    kereno  262r      REG              253,2 80167567360   
39977041 /lv_scratch/kereno/sens/io/JID$18$CDID$1$0$67066219710704159904.waf 
(deleted)

Original issue reported on code.google.com by ker...@gmail.com on 9 Jun 2015 at 3:15

GoogleCodeExporter commented 8 years ago
Updating with answers known so far:
- Loading the exact same dataset on the same machine (sensorium-11) was 
successful (months ago). What could have changed?
==> You weren't in a hurry then, so it wouldn't have been as much of a problem. 
It decided to wait. :-) (No way to really know the answer.)
- Would loading the data from several machines help? (I am currently splitting 
the input file and sending it to several nodes.)
==> Yes, because the materialization happens on the sender side. If you split 
the data 10 ways instead of 1, these files will each be about 1/10 as big per 
node (see the split-load sketch at the end of this thread).
- Why are WAF files generated for pre-sorted data (from a single source) 
instead of the data simply being loaded?
==> This is (as Yingyi mentioned) to prevent possible deadlocks that could 
otherwise happen due to flow control during the hash-partitioned merge process. 
Materializing provides a "spring buffer" that lets the sender and receivers 
pace themselves as needed.
- Why does AsterixDB hold on to these files until the instance is stopped, 
only then releasing the disk space?
==> As Young-Seok said, this shouldn't be happening; apparently it is the 
failure and the bad cleanup afterwards that are causing this problem.

Original comment by dtab...@gmail.com on 9 Jun 2015 at 9:45
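
One concrete way to act on the second answer above: assuming the localfs adapter accepts a comma-separated list of node://path entries in its "path" parameter (as in the multi-node load examples in the AsterixDB documentation), each NC can read its own slice of a pre-split input. The statement below is only a minimal sketch; the three-way split, the node names sensorium-12 and sensorium-13, and the /data/splits/... paths are hypothetical and would need to match the actual cluster layout.

use dataverse kereno;

/* Hypothetical three-way split: each NC loads only its own part of the input,
   so the WAF files materialized on any one node cover only that node's slice. */
load dataset page_views using localfs
(("path"="sensorium-11.ics.uci.edu:///data/splits/page-views-part-0,sensorium-12.ics.uci.edu:///data/splits/page-views-part-1,sensorium-13.ics.uci.edu:///data/splits/page-views-part-2"),
 ("format"="adm")) pre-sorted;

With an N-way split, each sender materializes roughly 1/N of the data, which is the per-node reduction described in the answer above.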