GiraffaFS / giraffa

Giraffa FileSystem (Slack: giraffa-fs.slack.com)
https://giraffa.ci.cloudbees.com
Apache License 2.0

Implement a way to load FSImage into HBase #27

Closed shvachko closed 9 years ago

shvachko commented 9 years ago

Original issue 27 created by shvachko on 2012-09-14T15:33:35.000Z:

We want to provide an easy transition path from HDFS to Giraffa. A nice solution would be a way to load the FSImage, edit logs, etc., into Giraffa.

shvachko commented 9 years ago

Comment #1 originally posted by shvachko on 2012-09-14T15:41:07.000Z:

I guess this is also a good place to ask -- should we also consider generating an FSImage from Giraffa? Would it even be useful to provide that feature?

shvachko commented 9 years ago

Comment #2 originally posted by shvachko on 2012-09-25T07:20:03.000Z:

<empty>

shvachko commented 9 years ago

Comment #3 originally posted by shvachko on 2012-09-25T07:45:19.000Z:

We can do it using an extension of OIV (offline image viewer) and OEV (offline edits viewer). These tools can already read the image and the edits; we (Giraffa) need to provide the table generation.
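To make the "table generation" part concrete, here is a rough sketch of a sink that an image reader could feed, one inode at a time. Everything in it is an assumption for illustration: the class name, the callback names, and the in-memory map standing in for the HBase namespace table. It does not use the real OIV or HBase APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: collects per-inode fields from an image reader into table rows. */
public class ImageToTableSketch {
  /** Stand-in for the namespace table: row key (full path) -> column name -> value. */
  private final Map<String, Map<String, String>> table = new TreeMap<>();
  private Map<String, String> current; // columns of the inode currently being read
  private String currentPath;

  /** Called by the (hypothetical) image reader when a new inode starts. */
  public void startInode(String path) {
    currentPath = path;
    current = new LinkedHashMap<>();
  }

  /** Called once per inode field; the field names used below are illustrative. */
  public void field(String name, String value) {
    current.put(name, value);
  }

  /** Called when the inode is complete; a real tool would write a table row here. */
  public void endInode() {
    table.put(currentPath, current);
    current = null;
    currentPath = null;
  }

  public Map<String, Map<String, String>> rows() {
    return table;
  }

  public static void main(String[] args) {
    ImageToTableSketch sink = new ImageToTableSketch();
    sink.startInode("/user/test/file1");
    sink.field("REPLICATION", "3");
    sink.field("NUM_BYTES", "1024");
    sink.endInode();
    System.out.println(sink.rows());
  }
}
```

Keying rows by full path is only one possible choice here; the actual row key layout is whatever the Giraffa namespace table uses.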

shvachko commented 9 years ago

Comment #4 originally posted by shvachko on 2012-12-05T02:04:22.000Z:

Here it is as a unit test!

It uses DFSTestUtil to create files in HDFS, reads the FSImage with OIV right after a saveNamespace() call, recreates the namespace in HBase, and then has Giraffa verify the files using DFSTestUtil again.

shvachko commented 9 years ago

Comment #5 originally posted by shvachko on 2012-12-05T08:57:11.000Z:

So the procedure for upgrading is to

  1. HDFS: safemode enter
  2. HDFS: savenamespace
  3. HDFS: stop-dfs.sh
  4. Giraffa: format
  5. Giraffa: start-giraffa.sh with same DNs as in HDFS
  6. Giraffa: call OIV on the fsimage (the latest checkpoint created by savenamespace)

We should create a startup option, -importHDFS, for the last step, so that when Giraffa starts with -importHDFS it populates the namespace before coming online. Ideally we should be able to do steps 4-5 in one shot, like in:

    start-giraffa -format -importHDFS /hdfs/name/current/fsimage

Then add a wiki page describing the upgrade procedure.
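A minimal sketch of how the proposed options might be parsed at startup. The -format and -importHDFS flags come from the proposal above; the class name, field names, and error handling are assumptions, not existing Giraffa code.

```java
import java.util.Optional;

/** Hypothetical parser for the proposed -format and -importHDFS startup options. */
public class StartupOptionsSketch {
  public final boolean format;
  public final Optional<String> importImagePath;

  private StartupOptionsSketch(boolean format, Optional<String> importImagePath) {
    this.format = format;
    this.importImagePath = importImagePath;
  }

  public static StartupOptionsSketch parse(String[] args) {
    boolean format = false;
    String image = null;
    for (int i = 0; i < args.length; i++) {
      switch (args[i]) {
        case "-format":
          format = true;
          break;
        case "-importHDFS":
          // The option takes exactly one argument: the path to the fsimage file.
          if (i + 1 >= args.length) {
            throw new IllegalArgumentException("-importHDFS requires an fsimage path");
          }
          image = args[++i];
          break;
        default:
          throw new IllegalArgumentException("Unknown option: " + args[i]);
      }
    }
    return new StartupOptionsSketch(format, Optional.ofNullable(image));
  }

  public static void main(String[] args) {
    StartupOptionsSketch opts =
        parse(new String[] {"-format", "-importHDFS", "/hdfs/name/current/fsimage"});
    if (opts.format) {
      System.out.println("format the namespace table");
    }
    // The import has to finish before the service accepts client requests.
    opts.importImagePath.ifPresent(
        p -> System.out.println("import " + p + " before going online"));
  }
}
```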

shvachko commented 9 years ago

Comment #6 originally posted by shvachko on 2012-12-05T18:13:53.000Z:

Prior to that though, we need to address the fact that we currently store List<LocatedBlock> in GRFA. We need to split blocks from their DataNodeLocation information in the Namespace table and somehow get Giraffa to populate the DataNodeLocation information through the DataNodes.

If you look at the unit test source code you will see I have a section where I grab LocatedBlocks directly from NameNode and use that:

    private long parseBlocks(int numOfBlocks, List<LocatedBlock> blocks,
                             BufferedReader br, String path) throws IOException {
      long totalLength = 0;
      // Each block in the OIV output is a " BLOCK" line followed by three fields.
      for (int i = 0; i < numOfBlocks; i++) {
        String blockLine = br.readLine();
        assert blockLine.equals(" BLOCK");
        long blockID = Long.parseLong(br.readLine().replace(" BLOCK_ID = ", "").trim());
        long blockLength = Long.parseLong(br.readLine().replace(" NUM_BYTES = ", "").trim());
        long genStamp = Long.parseLong(br.readLine().replace(" GENERATION_STAMP = ", "").trim());
        totalLength += blockLength;
      }
      // Workaround: take the located blocks (with DataNode locations) straight
      // from the NameNode instead of building them from the parsed fields above.
      MiniDFSCluster dfsCluster = UTIL.getDFSCluster();
      LocatedBlocks lbs = dfsCluster.getNameNode().getBlockLocations(path, 0, totalLength);
      blocks.addAll(lbs.getLocatedBlocks());
      return totalLength;
    }

This is how I currently get away with passing the unit test. We need to make it so Giraffa can discover the DataNodeLocations itself before this is really complete. That is a separate issue.
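One way to picture the split, as a sketch only: the namespace table keeps just the block identity, and the locations live in separate soft state that is rebuilt from what the DataNodes report. The class names, the processBlockReport hook, and the use of plain strings for DataNodes are all assumptions for illustration, not Giraffa or HDFS APIs.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of keeping block identity apart from DataNode locations. */
public class BlockLocationSplitSketch {
  /** What the namespace table would persist per block: no DataNode information. */
  public static final class BlockInfo {
    public final long blockId;
    public final long numBytes;
    public final long generationStamp;

    public BlockInfo(long blockId, long numBytes, long generationStamp) {
      this.blockId = blockId;
      this.numBytes = numBytes;
      this.generationStamp = generationStamp;
    }
  }

  /** Soft state: block ID -> DataNodes that reported it. Never imported from the image. */
  private final Map<Long, Set<String>> locations = new HashMap<>();

  /** Assumed hook, called when a DataNode reports the blocks it holds. */
  public void processBlockReport(String dataNode, Collection<Long> blockIds) {
    for (long id : blockIds) {
      locations.computeIfAbsent(id, k -> new TreeSet<>()).add(dataNode);
    }
  }

  /** Returns the DataNodes known to hold the block, or an empty set if none reported yet. */
  public Set<String> locationsOf(BlockInfo block) {
    return locations.getOrDefault(block.blockId, Collections.emptySet());
  }

  public static void main(String[] args) {
    BlockLocationSplitSketch manager = new BlockLocationSplitSketch();
    BlockInfo imported = new BlockInfo(1001L, 4096L, 1L); // as read from the image

    System.out.println(manager.locationsOf(imported)); // nothing reported yet
    manager.processBlockReport("dn1:50010", Arrays.asList(1001L, 1002L));
    manager.processBlockReport("dn2:50010", Arrays.asList(1001L));
    System.out.println(manager.locationsOf(imported));
  }
}
```

With a layout like this, an import would only need the BLOCK_ID, NUM_BYTES, and GENERATION_STAMP values already parsed from the image, and the NameNode lookup in the test would no longer be needed.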

shvachko commented 9 years ago

Comment #7 originally posted by shvachko on 2012-12-05T18:26:01.000Z:

Perhaps we can commit the test and open up new issue(s) for the other work?

shvachko commented 9 years ago

Comment #8 originally posted by shvachko on 2012-12-06T03:27:26.000Z:

I see. This will probably take a few issues, because even when we split Block from its locations, we will still need to add the functionality of populating locations from the BlockManager. Let's commit the test for now.

shvachko commented 9 years ago

Comment #9 originally posted by shvachko on 2012-12-06T05:22:32.000Z:

Committed the test (I'm assuming that was a +1); let me know if I should revert.