Open ghost opened 6 years ago
Do you get the desired results when you use Tabula's integrated CLI?
@criztovyl Yes, I checked, but it does not always give the desired results.
But I am only able to extract tables from a file that has ruled tables.
Tabula can only handle ruled tables; without them it can't do its job. Therefore I think this is not an issue with Tabula.
I took a look at your Random Numbers file anyway, and the tables seem to be text separated by tabs and/or other whitespace. Now the question is what data you want: are the numbers themselves enough, or do you need them to be in a table? (I suspect you need a table b/c Tabula is for tables.)
If you only need the numbers, you could use pdftotext (Debian has it in poppler-utils) and delete all the lines you do not need.
If you need a table, a bit of scripting in your language of choice should be enough to construct tables from the extracted text.
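As a rough illustration of that scripting step, here is a minimal Java sketch. It assumes the text was already extracted (for example with `pdftotext -layout abc.pdf abc.txt`) and that columns are separated by a tab or by at least two spaces. That is a guess about the file's layout, not something verified against it. The class `TextTableBuilder` and its `parse` method are hypothetical names and not part of Tabula.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: turns whitespace-aligned text into rows of cells. */
final class TextTableBuilder {

    /**
     * Splits each non-blank line into cells. A cell boundary is assumed to be
     * a tab (with optional surrounding whitespace) or a run of two or more
     * whitespace characters, so single spaces inside a cell are preserved.
     * Lines with fewer than minColumns cells are treated as non-table text
     * (titles, captions) and skipped.
     */
    static List<List<String>> parse(String text, int minColumns) {
        List<List<String>> rows = new ArrayList<>();
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines between rows
            }
            String[] cells = trimmed.split("\\s*\\t\\s*|\\s{2,}");
            if (cells.length >= minColumns) {
                rows.add(new ArrayList<>(Arrays.asList(cells)));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for pdftotext output.
        String sample = "Random Numbers\n\n12\t34   56\n78  90\t 11\n";
        for (List<String> row : parse(sample, 2)) {
            System.out.println(String.join(",", row)); // one CSV line per row
        }
    }
}
```

If the columns in the extracted text are separated by single spaces only, the split pattern would need to be loosened to `\\s+`, at the cost of breaking cells that contain spaces.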
I am working with the Tabula API. I am writing Java code to extract tables from any PDF using this API. I tried my code on several files. But I am only able to extract tables from a file that has ruled tables. I tried both SpreadsheetExtractionAlgorithm and BasicExtractionAlgorithm, but neither of them produced the desired results. I am sharing a sample PDF file in which I am unable to detect the tables, and the source code I have written in Java to extract the tables (the file is in .txt format because a .java file cannot be attached here). TableConverter.txt abc.pdf