Closed vaishaliarora277 closed 4 years ago
Using the dictionary disease for ami search, I also had the same error mentioned by @vaishaliarora277 for my corpus of 950 articles.
I am assuming that you are running something like:
ami -p myproject950 search disease ..
Do you get a list of successful documents (PMCddddddd)?
try :
<cd to project directory - i.e. where the 950 files are>
ls -l . PMC*/scholarly.html | wc
This should tell you how many transformations worked.
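Note that ls -l . also lists the project directory itself, so the wc figure includes more than the scholarly.html files. A minimal sketch of a stricter count follows, for any POSIX-style shell (Git Bash on Windows included). The function name count_scholarly_html is hypothetical and not part of ami; the only assumption about the project layout is the PMC*/scholarly.html pattern used above.

```shell
# Count the PMC* directories under a project directory that contain a
# scholarly.html file. The function name is hypothetical (not an ami command).
count_scholarly_html() {
  local project_dir="${1:-.}"
  local n=0
  local f
  for f in "$project_dir"/PMC*/scholarly.html; do
    # An unmatched glob stays a literal string, so check the file really exists.
    if [ -f "$f" ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}
```

Usage, from the directory that holds the project: count_scholarly_html myproject950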
The new logger from Remko may help on this.
Do you have a file logs/ami.log in your filestore (probably in your project directory)?
@petermr, I was running ami -p miniprojectfunders search --dictionary funders. I don't have a logs/ami.log file in my CProject directory named miniprojectfunders.
Yes, I did get a list of 952 items in my directory.
OS: Windows 10
C:\Users\me>miniprojectfunders 1s -1 . PMC*/scholarly.html | wc
'miniprojectfunders' is not recognized as an internal or external command,
operable program or batch file.
Thanks!
You have an unwanted word in the command (miniprojectfunders), and two of your "els" are "ones":
1s -1 . PMC*/scholarly.html | wc
should be:
ls -l . PMC*/scholarly.html | wc
We all make this mistake! It's very difficult to distinguish "el" from "one" in some fonts.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/petermr/openVirus/issues/73#issuecomment-653609867, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFTCSY7XIV4J7VR7OJC6PDRZX65ZANCNFSM4ONOFEGQ .
-- Peter Murray-Rust Founder ContentMine.org and Reader Emeritus in Molecular Informatics Dept. Of Chemistry, University of Cambridge, CB2 1EW, UK
In command prompt, I gave the command cd mpc (mpc is the directory where my 950 files are) and then I tried the command
ls -l . PMC*/scholarly.html | wc
The output was
'ls' is not recognized as an internal or external command, operable program or batch file.
So I tried the same commands in Git Bash and got some numbers as output.
@petermr Is the above output correct?
I also tried setting the environment variable MAVEN_OPTS = -Xmx512m -XX:MaxPermSize=128m as suggested (see also https://cwiki.apache.org/confluence/display/MAVEN/OutOfMemoryError) and gave the command ami -p mpc search --dictionary disease. The output was
[...]
Caused by: java.lang.OutOfMemoryError: Java heap space
569564 [main] DEBUG org.contentmine.cproject.args.DefaultArgProcessor - exception in option: or --transform; (1,2147483647); parseTransform; STRING: null / []; nlm2html; [nlm2html]
569564 [main] DEBUG org.contentmine.cproject.args.DefaultArgProcessor - exception in option: or --transform; (1,2147483647); parseTransform; STRING: null / []; nlm2html; [nlm2html]
[...]
@petermr Is there any change I should make to the environment variable?
Thanks @petermr,
I first entered this in the Command prompt: set MAVEN_OPTS=-Xmx512m -XX:MaxPermSize=128m
I then ran this again: C:\Users\me>ami -p miniprojectfunders search --dictionary funders
and got:
+++++++++++++++++++running: search; search([funders])[]
279807 [main] DEBUG org.contentmine.ami.plugins.CommandProcessor -
+++++++++++++++++++running: search; search([funders])[]
..............................................
large document (1507) for PMC6824115 truncated to 500 sections
.......................................................................................................
I got no search tables for dictionary funders, so next I deleted the large document PMC6824115 from the directory and ran the same command again:
C:\Users\me>ami -p miniprojectfunders search --dictionary funders
This time I got full data tables in my directory with complete search for dictionary funders.
https://photos.google.com/search/_tra_/photo/AF1QipPM1Mytn-__zViXjfugVKIslmzYWMYp9RPEHv-2
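To spot other unusually large documents before running a search, something like the following sketch may help. It ranks PMC* directories by the byte size of their scholarly.html, which is only a rough proxy for the section count that ami reports in the "truncated to 500 sections" message. The function name largest_scholarly_html and the choice of byte size as the measure are assumptions, not ami features.

```shell
# List the PMC* directories with the largest scholarly.html, biggest first.
# Prints "<bytes> <directory name>" per line. Hypothetical helper, not part of ami.
largest_scholarly_html() {
  local project_dir="${1:-.}"
  local top="${2:-5}"
  local f
  for f in "$project_dir"/PMC*/scholarly.html; do
    if [ -f "$f" ]; then
      # "wc -c < file" prints only the byte count; tr strips any padding spaces.
      printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "$(basename "$(dirname "$f")")"
    fi
  done | sort -rn | head -n "$top"
}
```

Usage: largest_scholarly_html miniprojectfunders 10 lists the ten largest candidates, which can then be inspected or moved aside rather than deleted outright.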
Thanks - this is very clear.
We'll take it in bits:
In command prompt, I gave the command cd mpc (mpc is the directory where my 950 files are) and then I tried the command ls -l . PMC*/scholarly.html | wc The output was 'ls' is not recognized as an internal or external command, operable program or batch file.
PMR> maybe something is wrong with your PATH
Try ls
or
which ls
("which" tells you where the ls program is).
If you get "ls" working you probably want either
ls .
(list all files in the current directory) or
ls PMC*/scholarly.html
(list the scholarly.html children of the PMC* directories).
Let's try to solve that and then move to the OOM error.
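Once ls works (for example in Git Bash), the complementary question is which documents did not get a scholarly.html. A minimal sketch, assuming the same PMC*/scholarly.html layout; the function name list_missing_scholarly_html is hypothetical:

```shell
# Print the name of every PMC* directory that has no scholarly.html.
# Hypothetical helper, not part of ami.
list_missing_scholarly_html() {
  local project_dir="${1:-.}"
  local d
  for d in "$project_dir"/PMC*/; do
    # The glob pattern ends in "/", so $d already carries a trailing slash.
    if [ -d "$d" ] && [ ! -f "${d}scholarly.html" ]; then
      basename "$d"
    fi
  done
}
```

Usage: list_missing_scholarly_html mpc prints one directory name per line, or nothing if every document was transformed.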
Well done.
I'll create an FAQ and you can answer it!
Thanks @petermr Sure, I'll do that.
@petermr I gave the command which ls in command prompt and the output was:
'which' is not recognized as an internal or external command, operable program or batch file.
So I tried in Git Bash and successfully got the path of ls.
For use in command prompt, I found that DIR is the equivalent of ls, from https://skimfeed.com/blog/windows-command-prompt-ls-equivalent-dir/#:~:text=Answer%3A%20Type%20DIR%20to%20show,commands%20and%20their%20Windows%20equivalents.
I used DIR in command prompt and the following output was obtained for a small directory
For my corpus containing 950 articles, I gave the command DIR mpc in command prompt and the output was
[...]
01/07/2020 03:44 PM <DIR> PMC7310742
01/07/2020 03:44 PM <DIR> PMC7312578
01/07/2020 03:44 PM <DIR> PMC7314749
01/07/2020 03:44 PM <DIR> PMC7316228
[...]
Next, when I gave the command DIR PMC*/scholarly.html, the output was
Parameter format not correct - "scholarly.html".
Is there any change I could make?
Well done.
On Sat, Jul 4, 2020 at 5:18 PM Lakshmi Devi Priya notifications@github.com wrote:
@petermr https://github.com/petermr I gave the command which ls in command prompt and the output was : 'which' is not recognized as an internal or external command, operable program or batch file. So I tried in git and successfully got the path of ls.
PMR> I forgot you were on Windows! which does not exist there. (We are going to remind each other which operating system we are on.)
But for using in command prompt, I found an equivalent command DIR for ls from https://skimfeed.com/blog/windows-command-prompt-ls-equivalent-dir/#:~:text=Answer%3A%20Type%20DIR%20to%20show,commands%20and%20their%20Windows%20equivalents.
PMR> Well done.
I used DIR in command prompt and the following output was obtained for a small directory [image: lsdir] https://user-images.githubusercontent.com/65600695/86516249-a7c14d80-be3c-11ea-9f31-423b5edb0968.PNG
For my corpus containing 950 articles, I gave the command DIR mpc in command prompt and the output was
[...]
01/07/2020 03:44 PM <DIR> PMC7310742
01/07/2020 03:44 PM <DIR> PMC7312578
01/07/2020 03:44 PM <DIR> PMC7314749
01/07/2020 03:44 PM <DIR> PMC7316228
[...]
PMR> Good
Next, when I gave the command DIR PMC*/scholarly.html , the output was Parameter format not correct - "scholarly.html".
Is there any change, I could do?
PMR> This is another difference between Windows and Unix: Windows uses backslash as the path separator.
Try
DIR PMC*\scholarly.html
To view the html files, I first gave the command cd mpc (mpc is the 950 articles directory) in command prompt, and then the command DIR PMC*\scholarly.html
The output was
The filename, directory name, or volume label syntax is incorrect.
@petermr I checked the filename and directory name. Is there anything I should do about the volume label?
I don't run Windows, so I forget it doesn't expand the *.
Maybe you can run PowerShell. Clyde Davies knows how.
@petermr I tried the command DIR PMC*\scholarly.html in Windows PowerShell. It worked and showed the following output:
Directory: C:\Users\Admin\Desktop\mpc\PMC5764404
Mode LastWriteTime Length Name
---- -------------- ------ -----
-a---- 04/07/2020 11:11 AM 124852 scholarly.html
[...]
Well done.
Closed as part of the learning process
Using ami search on a corpus of 950 articles gave an OutOfMemoryError. Searching with the dictionary showed the following error: ....