shubhamagarwal92 opened 6 years ago
Hi,
I am following the steps provided here to train my model.
I have pre-processed the datapack, but when I try to "Build Data Structures and extract anchor text", the job fails with a "GC overhead limit exceeded" error.
I have already increased the MAPRED and HADOOP memory to 15 GB and also passed -Dmapreduce.reduce.java.opts and -Dmapreduce.reduce.memory.mb.
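(For context, this is roughly how I raised the client-side heap before launching the job. The variable names are the standard Hadoop environment settings and the 15 GB value just mirrors what I described above, so treat it as a sketch rather than my exact setup.)

```sh
# Sketch: raise the heap for the Hadoop/MapReduce client JVMs.
# HADOOP_HEAPSIZE is interpreted in MB; HADOOP_CLIENT_OPTS passes raw JVM flags.
export HADOOP_HEAPSIZE=15000
export HADOOP_CLIENT_OPTS="-Xmx15g"
```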
My system has 8 cores and 32 GB of RAM, and I am using Java 8. This is the command I am running:
```sh
hadoop \
  jar target/FEL-0.1.0-fat.jar \
  com.yahoo.semsearch.fastlinking.io.ExtractWikipediaAnchorText \
  -Dmapreduce.map.env="JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64" \
  -Dmapreduce.reduce.env="JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64" \
  -Dyarn.app.mapreduce.am.env="JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64" \
  -Dmapred.job.map.memory.mb=15144 \
  -Dmapreduce.map.memory.mb=15144 \
  -Dmapreduce.reduce.memory.mb=15144 \
  -Dmapred.child.java.opts="-Xmx15g" \
  -Dmapreduce.map.java.opts='-Xmx15g -XX:NewRatio=8 -XX:+UseSerialGC' \
  -Dmapreduce.reduce.java.opts="-Xmx15g -XX:NewRatio=8 -XX:+UseSerialGC" \
  -input wiki/${WIKI_MARKET}/${WIKI_DATE}/pages-articles.block \
  -emap wiki/${WIKI_MARKET}/${WIKI_DATE}/entities.map \
  -amap wiki/${WIKI_MARKET}/${WIKI_DATE}/anchors.map \
  -cfmap wiki/${WIKI_MARKET}/${WIKI_DATE}/alias-entity-counts.map \
  -redir wiki/${WIKI_MARKET}/${WIKI_DATE}/redirects
```
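(One thing I was unsure about: the general guidance I found is that the JVM heap (-Xmx) should sit roughly 20% below the YARN container size (mapreduce.*.memory.mb), and that containers should be sized so the concurrent tasks still fit within 32 GB. Something like the sketch below; the numbers are only an illustration I came up with, not a recommendation from the FEL docs.)

```sh
# Illustration only: keep -Xmx ~20% below the YARN container size so the
# container is not killed, and size containers so concurrent tasks fit in 32 GB.
hadoop jar target/FEL-0.1.0-fat.jar \
  com.yahoo.semsearch.fastlinking.io.ExtractWikipediaAnchorText \
  -Dmapreduce.map.memory.mb=8192 \
  -Dmapreduce.map.java.opts="-Xmx6g" \
  -Dmapreduce.reduce.memory.mb=8192 \
  -Dmapreduce.reduce.java.opts="-Xmx6g" \
  ...   # same -input/-emap/-amap/-cfmap/-redir arguments as above
```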
Could you please suggest why this might be happening?
Pardon me, as I am a novice with Hadoop and Java.
@aasish Could you please comment on how I should resolve this?
FYI, I solved the issue with this shell script. The README needs to be updated.
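(The script is essentially a wrapper around the same ExtractWikipediaAnchorText invocation with the memory properties set up front. The version below is a simplified, hypothetical sketch of that shape; the paths, market, and memory values are placeholders, not the exact settings from my working script.)

```sh
#!/usr/bin/env bash
# Hypothetical sketch of a wrapper script; values below are placeholders.
set -euo pipefail

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HEAPSIZE=8000

WIKI_MARKET=en
WIKI_DATE=latest

hadoop jar target/FEL-0.1.0-fat.jar \
  com.yahoo.semsearch.fastlinking.io.ExtractWikipediaAnchorText \
  -Dmapreduce.map.memory.mb=8192 \
  -Dmapreduce.map.java.opts="-Xmx6g" \
  -Dmapreduce.reduce.memory.mb=8192 \
  -Dmapreduce.reduce.java.opts="-Xmx6g" \
  -input wiki/${WIKI_MARKET}/${WIKI_DATE}/pages-articles.block \
  -emap wiki/${WIKI_MARKET}/${WIKI_DATE}/entities.map \
  -amap wiki/${WIKI_MARKET}/${WIKI_DATE}/anchors.map \
  -cfmap wiki/${WIKI_MARKET}/${WIKI_DATE}/alias-entity-counts.map \
  -redir wiki/${WIKI_MARKET}/${WIKI_DATE}/redirects
```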